BiocBot is an AI-powered study assistant platform that enables students to interact with course material in a chat-based format. Instructors can upload documents (PDFs, DOCX, or TXT), which are automatically parsed, chunked, and embedded into a vector database (Qdrant) for semantic search. When a student asks a question, the system retrieves relevant chunks and generates a response grounded in course content.
- Document Management: Upload and organize course materials per lecture/unit
- Vector Search: Semantic search across documents using Qdrant
- AI Chat Interface: RAG-powered student chat with tutor and protege modes
- Per-Course Retrieval Mode: Instructor-controlled additive vs single-unit retrieval for chat
- Quiz Practice System: Self-paced AI-graded quizzes with attempt history
- Assessment Questions: Create and manage multiple-choice, true/false, and short-answer questions
- Flagging System: Students flag issues with questions; instructors review and respond
- Student Struggle Tracking: Activity logging to monitor and surface struggling students
- Course Structure: Organize content by units/lectures with publish controls
- User Management: Separate interfaces for instructors, TAs, and students
- TA Management: Instructors promote students to TAs with scoped permissions
- Onboarding Wizard: Guided AI-assisted course setup for instructors
- SAML / UBC CWL Auth: Shibboleth integration alongside local username/password auth
- User Agreement: Modal-gated terms acceptance before platform access
- Session Idle Timeout: Automatic logout after inactivity
BiocBot follows a split architecture with a public frontend and a private backend, adhering to clear separation of concerns for maintainability and security.
- Frontend: HTML + Vanilla JS (no frameworks), styled via separate CSS files
- Backend: Node.js (Express 5), built with modular architecture
- Database: MongoDB (documents, user sessions, analytics, quiz attempts)
- Vector Database: Qdrant for semantic search and similarity retrieval
- Embeddings: UBC GenAI Toolkit with OpenAI (text-embedding-3-small)
- Authentication: Passport.js (local strategy + SAML / UBC Shibboleth)
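
The retrieve-then-generate flow described in the overview can be sketched in a few lines. This is a minimal illustration, not BiocBot's code: in the real system Qdrant performs the similarity ranking, and the function names, the chunk shape (`{ text, vector }`), and the prompt wording are all assumptions made for the example.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query vector.
// Assumed chunk shape: { text: string, vector: number[] }.
function topKChunks(queryVector, chunks, k) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Build a prompt that asks the model to answer only from the retrieved text.
// The wording is a placeholder, not one of the project's system prompts.
function buildGroundedPrompt(question, retrieved) {
  const context = retrieved.map((c, i) => `[${i + 1}] ${c.text}`).join('\n');
  return `Answer using only the course excerpts below.\n\n${context}\n\nQuestion: ${question}`;
}
```

Numbering the excerpts in the prompt makes it easier to ask the model to cite which excerpt supports its answer.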
- Node.js v18.x or higher
- MongoDB instance
- Qdrant vector database (Docker recommended)
- OpenAI API key
Clone the repository and install dependencies:

    git clone <repository-url>
    cd tlef-biocbot
    npm install

Create a `.env` file in the root directory with the following variables:

    # MongoDB Connection
    MONGO_URI=mongodb://localhost:27017/biocbot
    # Server Port
    TLEF_BIOCBOT_PORT=8080
    # Qdrant Configuration
    QDRANT_URL=http://localhost:6333
    QDRANT_API_KEY=super-secret-dev-key
    # Embeddings Provider Configuration
    EMBEDDING_PROVIDER=ubc-genai-toolkit-llm
    # LLM Provider Settings
    LLM_PROVIDER=openai
    LLM_API_KEY=your-openai-api-key
    OPENAI_MODEL=gpt-4.1-mini
    LLM_EMBEDDING_MODEL=text-embedding-3-small

Start Qdrant with Docker:

    docker run -p 6333:6333 qdrant/qdrant

Start the development server:

    npm run dev

- Access: Navigate to `/instructor`
- Onboarding: Complete the guided course setup wizard (AI-assisted topic extraction)
- Upload Documents: Add course materials to units/lectures
- Create Questions: Build multiple-choice, true/false, and short-answer assessments
- Publish Units: Make content available to students
- Quiz Settings: Enable quiz practice, select testable units, and control material access for failed answers
- Retrieval Mode: On the course Home page, toggle "Use additive retrieval" to allow chat to include earlier published units in addition to the selected unit. When off, chat uses only the selected unit.
- Manage TAs: Promote students to TAs via the TA Hub; assign course and flag permissions
- Review Flags: View and respond to student-flagged question issues
- Monitor Students: Use the Student Hub to review engagement and struggle activity
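
The difference between additive and single-unit retrieval can be sketched as a small selection function. This is an illustration under stated assumptions, not the project's implementation: the unit shape (`{ name, published }`), the assumption that units are stored in course order, and the function name are all hypothetical.

```javascript
// Decide which units chat retrieval may draw from.
// Assumed unit shape: { name: string, published: boolean }, listed in course order.
function unitsForRetrieval(units, selectedUnitName, additive) {
  const selectedIndex = units.findIndex((u) => u.name === selectedUnitName);
  if (selectedIndex === -1) return [];

  // Single-unit mode: only the unit the student selected.
  if (!additive) return [units[selectedIndex].name];

  // Additive mode: earlier published units plus the selected unit itself.
  return units
    .slice(0, selectedIndex + 1)
    .filter((u, i) => u.published || i === selectedIndex)
    .map((u) => u.name);
}
```

The returned names could then be used as a filter on the vector search, so that unpublished or later units never contribute context.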
- Access: Navigate to `/student`
- Agreement: Accept the user agreement on first login
- Course Selection: Choose your course
- Chat Interface: Select a unit, then ask questions about course material
- Quiz Practice: Practice assessment questions with immediate AI feedback and attempt history
- Flag Questions: Report unclear or incorrect questions for instructor review
- Chat History: Review past conversations
- Access: Navigate to `/ta`
- Onboarding: Complete TA onboarding
- Settings: Configure TA-specific options
- Flagged Questions: Review and respond to flagged questions (if permitted)
BiocBot uses Qdrant for vector-based semantic search:
- Automatic Document Processing: Documents are automatically chunked, embedded, and stored on upload
- Semantic Search: Find relevant content using natural language queries
- Course-Aware Search: Filter results by course and lecture
- Real-time Indexing: New documents are immediately searchable
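
As an illustration of the chunking step, here is a minimal fixed-size chunker with overlap. The README does not say which chunking strategy BiocBot uses, so the character-window approach, the function name, and the parameters are assumptions chosen for simplicity.

```javascript
// Split text into windows of chunkSize characters, where consecutive
// windows share `overlap` characters so context is not cut at a boundary.
function chunkText(text, chunkSize, overlap) {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError('Require chunkSize > 0 and 0 <= overlap < chunkSize');
  }
  const chunks = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Each chunk would then be embedded and stored with course and lecture metadata, which is what makes course-aware filtering possible at query time.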
- `GET /api/qdrant/status`: Check Qdrant service status
- `POST /api/qdrant/process-document`: Process and store a document
- `POST /api/qdrant/search`: Semantic search across documents
- `DELETE /api/qdrant/document/:id`: Delete document chunks
- `GET /api/qdrant/collection-stats`: Get collection statistics
Visit /qdrant-test to test the Qdrant functionality interactively.
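
A client call to the search endpoint might look like the sketch below. Only the path `POST /api/qdrant/search` comes from this README; the request body fields (`query`, `courseId`, `lectureName`, `limit`) and both function names are assumptions, so check the route handler for the actual contract. The endpoint may also require an authenticated session, which is not handled here.

```javascript
// Build the URL and fetch options for a semantic search request.
// The body field names are assumed for illustration.
function buildSearchRequest(baseUrl, { query, courseId, lectureName, limit = 5 }) {
  return {
    url: new URL('/api/qdrant/search', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, courseId, lectureName, limit }),
    },
  };
}

// Send the request using the global fetch available in Node.js 18+.
async function searchCourseContent(baseUrl, params) {
  const { url, options } = buildSearchRequest(baseUrl, params);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Search failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

With the sample `.env` above, `baseUrl` would be `http://localhost:8080`.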
The repository is laid out as follows:

    tlef-biocbot/
    ├── public/                  # Frontend assets
    │   ├── common/
    │   │   └── scripts/         # Shared scripts (auth, login, idle-timer, etc.)
    │   ├── instructor/          # Instructor interface
    │   │   ├── scripts/         # home, settings, onboarding, ta-hub, student-hub, ...
    │   │   └── *.html
    │   ├── student/             # Student interface
    │   │   ├── scripts/         # dashboard, quiz, history, flagged, ...
    │   │   └── *.html
    │   ├── ta/                  # TA interface
    │   │   ├── scripts/
    │   │   └── *.html
    │   └── qdrant-test.html     # Qdrant testing page
    ├── src/                     # Backend source
    │   ├── config/              # Passport, app config
    │   ├── middleware/          # Auth middleware (requireAuth, requireRole, etc.)
    │   ├── models/              # MongoDB models
    │   ├── routes/              # API route handlers
    │   ├── services/            # Business logic (LLM, Qdrant, auth, tracker)
    │   └── server.js            # Main server entry point
    └── documents/               # Project documentation

| Model | Collection | Purpose |
|---|---|---|
| Course | courses | Course metadata, lecture structure, quiz settings |
| User | users | Accounts, roles, preferences, struggle state |
| Document | documents | Uploaded files and parsed content |
| Question | embedded in Course | MC, TF, and short-answer questions per lecture |
| QuizAttempt | quizAttempts | Per-student quiz attempt records |
| FlaggedQuestion | flaggedQuestions | Student-reported question issues |
| StruggleActivity | struggleActivity | Student struggle state transitions |
| UserAgreement | useragreements | Terms acceptance records |
- LLMService (`src/services/llm.js`): AI chat responses and short-answer evaluation via UBC GenAI Toolkit
- QdrantService (`src/services/qdrantService.js`): Vector DB indexing and semantic search
- AuthService (`src/services/authService.js`): User registration, login, preferences
- TrackerService (`src/services/tracker.js`): Student engagement and struggle tracking
- prompts (`src/services/prompts.js`): System prompt management (base, tutor, protege, quizHelp modes)
- `requireAuth`: Must be logged in
- `requireStudent` / `requireInstructor` / `requireInstructorOrTA`: Role-based access
- `requireStudentEnrolled`: Must be enrolled in the requested course
- `requireTAPermission(permission)`: TA-scoped permission checks
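
Middleware of this kind usually follows the Express `(req, res, next)` pattern, with permission checks written as factories that return a middleware function. The sketch below reuses two names from the list for readability but is not the project's implementation; the `req.user` shape (`{ role, permissions }`), the role strings, the rule that instructors bypass TA permission checks, and the error payloads are all assumptions.

```javascript
// Illustrative only: reject requests that have no authenticated user.
// Assumes Passport has populated req.user for logged-in sessions.
function requireAuth(req, res, next) {
  if (!req.user) {
    return res.status(401).json({ error: 'Authentication required' });
  }
  return next();
}

// Illustrative only: factory returning middleware that checks one TA permission.
// Assumed user shape: { role: string, permissions: string[] }.
function requireTAPermission(permission) {
  return function checkTAPermission(req, res, next) {
    const user = req.user;
    if (!user) {
      return res.status(401).json({ error: 'Authentication required' });
    }
    // Assumption: instructors pass through; TAs need the named permission.
    if (user.role === 'instructor') return next();
    const granted = Array.isArray(user.permissions) && user.permissions.includes(permission);
    if (user.role === 'ta' && granted) return next();
    return res.status(403).json({ error: `Missing permission: ${permission}` });
  };
}
```

A route could then chain them, for example `router.get('/flags', requireAuth, requireTAPermission('flags'), handler)`, where `'flags'` is a hypothetical permission name.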
Run the test suite with:

    npm test              # all Playwright tests, headless
    npm run test:headed   # run with a visible browser
    npm run test:ui       # Playwright UI mode
    npm run test:report   # open the last HTML report

The Playwright config (`playwright.config.js`) launches its own server with `BIOCBOT_TEST_LLM_STUB=1`, so the LLM and embeddings calls are intercepted by deterministic stubs (`src/services/llmStub.js`, `src/services/embeddingsStub.js`). You do not need an OpenAI key to run tests, but you still need MongoDB and Qdrant reachable at the URLs in your `.env`.
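
To show what a deterministic stub switched on by an environment flag can look like, here is a minimal sketch. It does not reproduce `src/services/embeddingsStub.js`; the vector construction, the default dimension of 8, and both function names are assumptions made for the example.

```javascript
// Deterministic stand-in for an embeddings call: the same text always
// yields the same vector, so tests are repeatable without network access.
function stubEmbedding(text, dimensions = 8) {
  const vector = new Array(dimensions).fill(0);
  for (let i = 0; i < text.length; i++) {
    vector[i % dimensions] += text.charCodeAt(i) / 255;
  }
  return vector;
}

// Use the stub when the test flag is set; otherwise use the real embedder
// supplied by the caller.
function selectEmbedder(env, realEmbedder) {
  return env.BIOCBOT_TEST_LLM_STUB === '1' ? stubEmbedding : realEmbedder;
}
```

Passing `process.env` as `env` keeps the selection logic easy to unit-test, because a plain object can be substituted in tests.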
A GitHub Actions workflow at .github/workflows/playwright.yml runs the full Playwright suite on every push to main and on every pull request targeting main.
- Boots `mongo:7` and `qdrant/qdrant:latest` as service containers inside the runner.
- Installs Node 20 and project dependencies.
- Installs the Chromium browser via `npx playwright install --with-deps chromium`.
- Runs `npm test` with `BIOCBOT_TEST_LLM_STUB=1` so no external LLM calls are made.
- Uploads the Playwright HTML report, Monocart report, coverage reports, and (on failure) traces/videos/screenshots as workflow artifacts.
The workflow is plain YAML, so pushing the file to GitHub is enough to register it. No extra configuration is required for the default case because:
- MongoDB and Qdrant run as ephemeral service containers (no external DB needed).
- The LLM stub means no API keys / secrets need to be configured.
- All required env vars are inlined in the `env:` block of the workflow.
Steps to enable:
- Push this branch (which includes `.github/workflows/playwright.yml`) to GitHub.
- Open the repository's Actions tab on github.com. If Actions are disabled at the org level, an admin must enable them under Settings → Actions → General → Allow all actions.
- The workflow will run automatically on the next push or pull request. You can also trigger a run manually from the Actions tab if you add a `workflow_dispatch:` trigger.
- Go to Actions → Playwright Tests → (latest run).
- Scroll to the Artifacts section at the bottom to download:
  - `playwright-report`: standard Playwright HTML report
  - `monocart-report`: Monocart report with coverage
  - `coverage-reports`: raw v8/lcov coverage
  - `test-results`: traces, videos, screenshots (only uploaded on failure)
- Unzip and open `index.html` from any of the reports locally.
- Different Node version: change `node-version: 20` in the workflow.
- Switch from `npm install` to `npm ci`: commit `package-lock.json` (currently in `.gitignore`), then change the install step and re-enable `cache: npm` on the `setup-node` action.
- Add a manual trigger: add `workflow_dispatch:` under the top-level `on:` block.
- Run on more branches: extend the `branches:` lists under `push:` and `pull_request:`.
ISC License