Fix arxivbot workflow: guard against non-BibTeX DOI responses and shell injection #1107

Merged: pancetta merged 1 commit into source from copilot/fix-ci-failure on Jun 15, 2026
Copilot AI commented Jun 15, 2026

The arxiv_to_publications_correct workflow still failed after #1106 because two issues remained unaddressed.

Bugs fixed

Unhandled IndexError on unexpected DOI responses

When a DOI resolves to a 200 OK but returns non-BibTeX content (HTML redirect, plain-text error), bib.split("{") yields an empty rest1, and rest1[0] crashes with IndexError — exiting the script with code 1 before any output is captured.

Before:

bib = req.content.decode()  # blindly trusts content is BibTeX
# ...
bType, *rest1 = bib.split("{")
oldID, *rest2 = rest1[0].split(",")  # IndexError if no '{' in response
db.entries.remove(entries[id_db])    # already removed — database now corrupt

After:

# Validate BibTeX before touching the database
try:
    bib = bibtex_req.text
    bType, *rest1 = bib.split("{")
    if not rest1:
        print(f'Ignoring {id_db}, DOI did not return valid BibTeX ...')
        continue
    # ... parse and validate new_entries ...
except Exception as exc:
    print(f'Ignoring {id_db}, error processing BibTeX from DOI: {exc}')
    continue
# Only mutate db after confirmed valid replacement
db.entries.remove(entries[id_db])
db.entries.extend(new_entries)

db.entries.remove is now deferred until a valid replacement is confirmed, preventing database corruption on partial failure.
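
The validate-then-mutate ordering described above can be sketched in isolation. This is a minimal illustration, not the workflow script's actual code: `split_bibtex_header` and `replace_entry` are hypothetical helpers, the database is stood in for by a plain list of dicts, and the `@` prefix check is an extra assumption beyond the `if not rest1` guard shown in the PR.

```python
def split_bibtex_header(bib):
    """Return (entry_type, citation_key), or None if `bib` does not look like BibTeX.

    Hypothetical helper: only inspects the header, it does not parse the fields.
    """
    entry_type, *rest = bib.split("{")
    # No '{' at all (HTML page, plain-text error) leaves `rest` empty.
    if not rest or not entry_type.strip().startswith("@"):
        return None
    key, *fields = rest[0].split(",")
    # A real entry has a non-empty key followed by at least one field.
    if not key.strip() or not fields:
        return None
    return entry_type.strip().lstrip("@"), key.strip()


def replace_entry(entries, index, bib):
    """Swap entries[index] for a record built from `bib`.

    `entries` is left untouched when validation fails, so a bad DOI
    response can never leave the list half-updated.
    """
    header = split_bibtex_header(bib)
    if header is None:
        return False
    entry_type, key = header
    new_entry = {"ENTRYTYPE": entry_type, "ID": key}
    # Mutation happens only after the replacement is known to be usable.
    entries[index] = new_entry
    return True


entries = [{"ENTRYTYPE": "misc", "ID": "arxiv1"}]
replace_entry(entries, 0, "<html>Not found</html>")            # rejected, list unchanged
replace_entry(entries, 0, "@article{Doe2020,\n title={X}\n}")  # accepted
```

The point of the design is that every failure path returns before the first write, so the caller never has to undo a partial update.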

Shell injection via ${{ github.event.issue.body }}

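Inlining the issue body directly into the shell command is unsafe: GitHub Actions substitutes ${{ ... }} into the script text before the shell parses it, so a paper title containing ", backticks, or $(...) can break the quoting or run arbitrary commands.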

# Before — unsafe template expansion into shell string
python3 arxiv_to_publications_correct.py -b "${{ github.event.issue.body }}"

# After — body passed via env variable, shell-safe expansion
env:
  ISSUE_BODY: ${{ github.event.issue.body }}
run: |
  python3 arxiv_to_publications_correct.py -b "$ISSUE_BODY"
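
The difference between the two forms can be reproduced locally. The sketch below assumes a POSIX `sh` is on the PATH; `run_inlined` and `run_via_env` are hypothetical helpers, and the string concatenation only imitates what template expansion does to the script text.

```python
import os
import subprocess


def run_inlined(body):
    """Paste `body` into the script text, as ${{ ... }} expansion would."""
    script = 'printf "%s" "' + body + '"'
    result = subprocess.run(["sh", "-c", script], capture_output=True, text=True)
    return result.stdout


def run_via_env(body):
    """Pass `body` through an environment variable; the script text is fixed."""
    script = 'printf "%s" "$ISSUE_BODY"'
    env = dict(os.environ, ISSUE_BODY=body)
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, env=env
    )
    return result.stdout


body = "title `echo INJECTED`"
print(run_inlined(body))   # the backticks are executed by the shell
print(run_via_env(body))   # the body arrives unchanged, backticks included
```

In the environment-variable form the shell expands `$ISSUE_BODY` to a single word and does not re-scan its contents for command substitution, which is why the body is safe to quote there.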

Minor

  • requests added explicitly to pip installs (was implicitly available on runners).
  • Two response objects renamed (bibtex_req / meta_req) to avoid reuse confusion.

Copilot AI changed the title from "Fix CI: handle invalid BibTeX responses and prevent shell injection" to "Fix arxivbot workflow: guard against non-BibTeX DOI responses and shell injection" Jun 15, 2026
Copilot AI requested a review from pancetta June 15, 2026 06:12
@pancetta pancetta marked this pull request as ready for review June 15, 2026 06:16
@pancetta pancetta merged commit 0a5cc09 into source Jun 15, 2026
1 check passed