Expose df in Rubin pooling + support varying d.f. #562

Open
munoztd0 wants to merge 1 commit into openpharma:main from munoztd0:junco_rbmi_pool

@munoztd0

@munoztd0 munoztd0 commented Jun 1, 2026


pool_internal.rubin() now appends df to each per-parameter result, making it accessible to downstream callers without re-computing it.

The strict requirement that all per-imputation degrees of freedom be identical is replaced with a median fallback: when d.f. are constant the behaviour is unchanged; when they vary (as occurs with MMRM analysis functions, where each imputed dataset yields slightly different residual d.f.) the median is used as v_com in Rubin's rules rather than throwing an error.

Both changes should be low-risk: the constant-d.f. path is unaffected, and the median fallback is the standard pragmatic choice.
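For readers who want to see where v_com enters the calculation, here is a minimal sketch of Rubin's rules with the Barnard-Rubin small-sample degrees of freedom, taking the median of the per-imputation d.f. as v_com. It is written in Python purely for illustration and is not the rbmi implementation; the function name `rubin_pool` and the returned keys are assumptions, and the d.f. are assumed to be finite.

```python
import math
from statistics import mean, median, variance


def rubin_pool(estimates, variances, dfs):
    """Pool M per-imputation results with Rubin's rules (illustrative sketch).

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    dfs:       complete-data degrees of freedom, one per imputed dataset (finite)
    """
    m = len(estimates)
    if m < 2:
        raise ValueError("need at least two imputations")

    # Median fallback: identical to the common value when all d.f. agree.
    v_com = median(dfs)

    q_bar = mean(estimates)           # pooled point estimate
    u_bar = mean(variances)           # within-imputation variance
    b = variance(estimates)           # between-imputation variance (M - 1 denominator)
    t = u_bar + (1 + 1 / m) * b       # total variance

    # Fraction of the total variance attributable to missing data.
    lam = (1 + 1 / m) * b / t if t > 0 else 0.0

    # Barnard-Rubin adjusted degrees of freedom.
    v_old = (m - 1) / lam**2 if lam > 0 else math.inf
    v_obs = (v_com + 1) / (v_com + 3) * v_com * (1 - lam)
    df = v_obs if math.isinf(v_old) else v_old * v_obs / (v_old + v_obs)

    return {"est": q_bar, "se": math.sqrt(t), "df": df}
```

With d.f. of 10, 12 and 11 the median is 11, so the pooled result is the same as if every imputation had reported 11. How far the median can sit from the individual values without affecting inference is exactly the question that needs a reference.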

@munoztd0

munoztd0 commented Jun 1, 2026

Author

As per our previous conversation in #560 with @danielinteractive and @luwidmer.

@munoztd0

munoztd0 commented Jun 1, 2026

Author

fix johnsonandjohnson/junco#369

@munoztd0 munoztd0 marked this pull request as draft June 11, 2026 09:50
@munoztd0 munoztd0 marked this pull request as ready for review June 11, 2026 09:50
@danielinteractive

Collaborator

Hi @luwidmer , what are your thoughts on this one? 😃

@luwidmer

Collaborator

Thank you for flagging this again @danielinteractive, I was out of office. I will take a look over the next few days, @munoztd0.

@tobiasmuetze


It would be good to see a reference for the statement "median fallback is the standard pragmatic choice". Such a decision would need to be thoroughly documented and also highlighted in the methods vignette.

@luwidmer luwidmer left a comment

Collaborator

@danielinteractive @munoztd0:

  • I agree with @tobiasmuetze here RE the statement "median fallback is the standard pragmatic choice". I would also like to see references for this added to the documentation.

In addition:

  • The old code threw a clear error when dfs varied. Now it silently proceeds with median(dfs). This can silently introduce unexpected behavior in case a user relied on this error. From a software engineering standpoint I don't think this is desirable.
  • This PR introduces test failures, which would need to be addressed.
  • If the new median df is indeed desirable, the behavior there should have tests as well.
  • pool_internal.jackknife(), pool_internal.bootstrap(), and pool_internal.bmlmi() all return the parametric_ci() list without $df. Only pool_internal.rubin() now appends it. This inconsistency means downstream code cannot reliably access $df without first checking which method was used. If df should be exposed, one should consider doing this as consistently as possible (and/or as_data_frame_internal() should be updated to include it).
  • as.data.frame.pool() won't surface the new df. The as_data_frame_internal() function extracts est, se, ci, pvalue but not df. If the goal is to expose df to downstream callers, it should appear in the data frame representation too, which is the primary user-facing output.
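One possible way to address the silent-fallback concern is to keep the error as the default and make the median an explicit opt-in that also emits a warning. The sketch below illustrates that idea in Python; it is not rbmi code, and the function name `resolve_v_com` and the `on_varying` argument are hypothetical.

```python
import warnings
from statistics import median


def resolve_v_com(dfs, on_varying="error"):
    """Return the complete-data d.f. to use in pooling (illustrative sketch).

    on_varying="error"  keeps the strict behaviour: varying d.f. raise an error.
    on_varying="median" opts in to the median fallback and warns about it.
    """
    if on_varying not in ("error", "median"):
        raise ValueError("on_varying must be 'error' or 'median'")

    dfs = list(dfs)
    if not dfs:
        raise ValueError("dfs must not be empty")

    # Constant d.f.: nothing to decide, behaviour is the same in both modes.
    if len(set(dfs)) == 1:
        return dfs[0]

    if on_varying == "error":
        raise ValueError("degrees of freedom differ across imputations")

    # Explicit opt-in: proceed, but make the choice visible to the user.
    warnings.warn("degrees of freedom vary across imputations; using the median")
    return median(dfs)
```

Under this design, users who relied on the old error keep it unless they change an argument, and the fallback never happens without a visible signal.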


4 participants