Ndevu12/Research_Assistant_Model

AI Research Assistant

A local-first research pipeline that retrieves academic papers from multiple scholarly APIs, ranks and clusters them with embeddings, synthesizes cross-paper insights, and exports reports in several formats. Uses Ollama by default for fully local LLM inference, with optional OpenAI and Anthropic providers.

Built with Python 3.13, pydantic-ai, sentence-transformers, and async I/O.

Documentation: https://ndevu12.github.io/Research_Assistant_Model/ (architecture, configuration, API, operations, and known issues).

Features

  • Multi-stage pipeline: query understanding → expansion → retrieval → deduplication → ranking → clustering → synthesis → gap analysis → citation export → report generation
  • Local-first LLM: Ollama with resource-aware model auto-selection (llama3.1:8b or llama3.2:3b)
  • Cloud LLM support: OpenAI and Anthropic via the src/models/ provider abstraction
  • Multi-source retrieval: OpenAlex, Semantic Scholar (arXiv, CrossRef, and others configurable)
  • Embedding-backed stages: sentence-transformers (bge-small-en-v1.5) for dedup, ranking, and clustering
  • Structured output: Pydantic models throughout; JSON, Markdown, HTML, and print-ready PDF (HTML)
  • Citation export: BibTeX, APA, MLA, Chicago
  • Session memory: optional SQLite-backed interactive sessions
  • Auto-setup: installs dependencies and Ollama, and pulls the configured local model on first run

Requirements

  • Python 3.13+
  • Pipenv for dependency management
  • Internet connection (API retrieval; optional for fully offline LLM after model download)

Local LLM RAM (approximate):

| Model | RAM | Disk |
|---|---|---|
| llama3.2:3b | 4–6 GB | ~2.5 GB |
| llama3.1:8b | 8–10 GB | ~5 GB |

Cloud providers require only an API key; no Ollama install is needed.

Quick Start

pip install pipenv
pipenv install
cp .env.example .env          # optional; edit as needed
pipenv run python -m src "transformer attention mechanisms"

On first run with the default Ollama provider, the assistant will:

  1. Check Python and embedding dependencies
  2. Install or start Ollama if needed
  3. Resolve your target model (auto, env override, or config)
  4. Pull the model if it is not already installed
  5. Run the research pipeline

Use Pipenv for all commands (pipenv run python -m src). Running plain python -m src may miss dependencies such as sentence-transformers.

While a query runs, the CLI streams live progress to stderr: pipeline stage checkmarks, sub-activities (e.g. "Analyzing paper 2/5"), and AI token previews during LLM calls. Disable with --no-progress or RA_PIPELINE__STREAM_PROGRESS=false.

Usage

Command line

# Interactive mode
pipenv run python -m src

# Single query (markdown output)
pipenv run python -m src "your research query"

# HTML report saved to file
pipenv run python -m src --format html -o reports/report.html "your query"

# Print-ready PDF (HTML; open in browser → Print → Save as PDF)
pipenv run python -m src --format pdf -o reports/report.pdf.html "your query"

# JSON output with citation exports
pipenv run python -m src --format json --export bibtex,apa "your query"

# Session memory in batch mode
pipenv run python -m src --session "your query"

Setup & health check

pipenv run python -m setups.health_check
pipenv run python -m setups.manager                  # auto-select model
pipenv run python -m setups.manager --model llama3.1:8b

See setups/README.md for setup details.

Configuration

Configuration is layered (highest precedence first):

  1. Environment variables (RA_* prefix, nested with __)
  2. Project .env file
  3. YAML files in config/ (default.yaml, models.yaml, ranking.yaml, providers.yaml)
  4. Code defaults

Copy .env.example to .env to get started.
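
The layering above can be sketched in a few lines of Python. This is only an illustration of the precedence order and of the RA_ prefix with __ nesting; it is not the project's loader (which, judging by the dependency list, relies on pydantic-settings). The function names env_to_nested, deep_merge, and resolve_settings are hypothetical, and values read from the environment are left as strings here.

```python
from typing import Any


def env_to_nested(env: dict[str, str], prefix: str = "RA_") -> dict[str, Any]:
    """Turn {"RA_LLM__MODEL": "auto"} into {"llm": {"model": "auto"}}."""
    nested: dict[str, Any] = {}
    for key, value in env.items():
        if not key.startswith(prefix):
            continue  # ignore unrelated variables
        parts = key[len(prefix):].lower().split("__")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of base with override applied recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_settings(defaults, yaml_values, dotenv_values, shell_env):
    """Apply layers from lowest to highest precedence."""
    settings = defaults
    for layer in (yaml_values, env_to_nested(dotenv_values), env_to_nested(shell_env)):
        settings = deep_merge(settings, layer)
    return settings


# Hypothetical sample values for each layer.
settings = resolve_settings(
    defaults={"llm": {"provider": "ollama", "model": "auto", "temperature": 0.2}},
    yaml_values={"ranking": {"top_k": 25}},
    dotenv_values={"RA_LLM__MODEL": "llama3.2:3b"},
    shell_env={"RA_LLM__MODEL": "llama3.1:8b", "RA_RANKING__TOP_K": "10", "HOME": "/tmp"},
)
print(settings)
```

In this sample the shell value for RA_LLM__MODEL wins over the .env value, and RA_RANKING__TOP_K overrides the YAML value, while untouched defaults such as the provider survive.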

Environment variables

Retrieval APIs

| Variable | Required | Description |
|---|---|---|
| S2_API_KEY | No | Semantic Scholar API key (higher rate limits) |
| RA_CROSSREF_MAILTO | If CrossRef enabled | Email for CrossRef polite pool |
| CROSSREF_MAILTO | If CrossRef enabled | Alias for CrossRef mailto |

LLM: shared settings

All providers use the RA_LLM__* namespace. API keys can also be set via provider-specific env vars (see below) or the unified RA_LLM__API_KEY.

| Variable | Default | Description |
|---|---|---|
| RA_LLM__PROVIDER | ollama | ollama, openai, or anthropic |
| RA_LLM__MODEL | auto | Model name, or auto for resource-based selection (Ollama only) |
| RA_LLM__BASE_URL | provider-specific | API base URL |
| RA_LLM__API_KEY | (unset) | Unified API key override for any provider |
| RA_LLM__TEMPERATURE | 0.2 | Sampling temperature |
| RA_LLM__TIMEOUT_SECONDS | 120 | Request timeout |

LLM: Ollama (default)

| Variable | Default | Description |
|---|---|---|
| RA_LLM__PROVIDER | ollama | Use local Ollama server |
| RA_LLM__MODEL | auto | auto, llama3.1:8b, llama3.2:3b, etc. (see config/ollama_models.yaml) |
| RA_LLM__BASE_URL | http://localhost:11434/v1 | Ollama OpenAI-compatible endpoint |
| RA_LLM__API_KEY | ollama | Placeholder key (Ollama ignores it) |
| OLLAMA_API_KEY | (unset) | Alternative to RA_LLM__API_KEY |

Model selection: Set RA_LLM__MODEL=auto to pick the best model for your RAM/disk. Override with a specific model name in .env (e.g. RA_LLM__MODEL=llama3.1:8b). Supported models are listed in config/ollama_models.yaml. Setup pulls missing models automatically on startup.
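
As a rough illustration of what resource-based selection could look like, here is a minimal sketch. It assumes the lower bounds from the RAM/disk table in Requirements as thresholds and falls back to the smallest model when nothing fits; the real rules live in config/ollama_models.yaml and src/config/ and may differ. ModelSpec, CANDIDATES, and select_model are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    min_ram_gb: float
    min_disk_gb: float


# Ordered from most to least capable. Thresholds are assumptions taken from
# the lower bounds of the approximate RAM/disk table in this README.
CANDIDATES = (
    ModelSpec("llama3.1:8b", min_ram_gb=8.0, min_disk_gb=5.0),
    ModelSpec("llama3.2:3b", min_ram_gb=4.0, min_disk_gb=2.5),
)


def select_model(configured: str, ram_gb: float, free_disk_gb: float) -> str:
    """Return the configured model, or pick one when configured == 'auto'."""
    if configured != "auto":
        return configured  # an explicit model name always wins
    for spec in CANDIDATES:
        if ram_gb >= spec.min_ram_gb and free_disk_gb >= spec.min_disk_gb:
            return spec.name
    # Assumption: fall back to the smallest model rather than failing.
    return CANDIDATES[-1].name


print(select_model("auto", ram_gb=16, free_disk_gb=50))
print(select_model("auto", ram_gb=6, free_disk_gb=50))
```

Checking both RAM and free disk means a machine with plenty of memory but little disk space still gets the smaller model.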

LLM: OpenAI

| Variable | Required | Description |
|---|---|---|
| RA_LLM__PROVIDER | Yes | Set to openai |
| RA_LLM__MODEL | Yes | e.g. gpt-4o-mini |
| OPENAI_API_KEY | Yes* | OpenAI API key |
| RA_LLM__API_KEY | Yes* | Alternative unified key |
| RA_LLM__BASE_URL | No | Custom endpoint (defaults to https://api.openai.com/v1; use for LM Studio and other OpenAI-compatible servers) |

* One of OPENAI_API_KEY or RA_LLM__API_KEY is required.

RA_LLM__PROVIDER=openai
RA_LLM__MODEL=gpt-4o-mini
OPENAI_API_KEY=sk-...
RA_SYNTHESIS__LLM_ENABLED=true

LLM: Anthropic

| Variable | Required | Description |
|---|---|---|
| RA_LLM__PROVIDER | Yes | Set to anthropic |
| RA_LLM__MODEL | Yes | e.g. claude-3-5-haiku-latest |
| ANTHROPIC_API_KEY | Yes* | Anthropic API key |
| RA_LLM__API_KEY | Yes* | Alternative unified key |
| RA_LLM__BASE_URL | No | Custom Anthropic-compatible endpoint |

* One of ANTHROPIC_API_KEY or RA_LLM__API_KEY is required.

RA_LLM__PROVIDER=anthropic
RA_LLM__MODEL=claude-3-5-haiku-latest
ANTHROPIC_API_KEY=sk-ant-...
RA_SYNTHESIS__LLM_ENABLED=true

Pipeline & synthesis

| Variable | Default | Description |
|---|---|---|
| RA_SYNTHESIS__LLM_ENABLED | false | Enable LLM-based synthesis (recommended for 8B+ local or cloud models) |
| RA_RANKING__TOP_K | 25 | Papers kept after ranking |
| RA_PIPELINE__DEBUG | false | Verbose pipeline logging |
| RA_PIPELINE__STREAM_PROGRESS | true | Live stage/LLM progress on stderr (TTY only) |
| RA_DEBUG | (unset) | Alias for debug mode (1, true, yes) |
| RA_CONFIG_DIR | (unset) | Override path to config/ directory |

Provider implementations live in src/models/ (ollama.py, openai.py, anthropic.py). Each resolves API keys via RA_LLM__API_KEY first, then the provider-specific env var.
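
The key lookup order described above could be expressed roughly as follows. This is a sketch, not the code in src/models/: resolve_api_key and PROVIDER_KEY_VARS are hypothetical names, and returning the "ollama" placeholder when no key is set is an assumption based on the Ollama table.

```python
# Provider-specific variables named in the tables in this README.
PROVIDER_KEY_VARS = {
    "ollama": "OLLAMA_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def resolve_api_key(provider: str, env: dict[str, str]) -> str:
    """Prefer the unified RA_LLM__API_KEY, then the provider-specific variable."""
    unified = env.get("RA_LLM__API_KEY")
    if unified:
        return unified
    specific = env.get(PROVIDER_KEY_VARS[provider])
    if specific:
        return specific
    if provider == "ollama":
        # Assumption: Ollama ignores the key, so a placeholder is enough.
        return "ollama"
    raise ValueError(
        f"Set RA_LLM__API_KEY or {PROVIDER_KEY_VARS[provider]} for provider {provider!r}"
    )


print(resolve_api_key("openai", {"OPENAI_API_KEY": "sk-example"}))
```

In practice the env argument would be os.environ; passing a plain dict keeps the function easy to test.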

Project Structure

Research_Assistant_Model/
├── config/                     # YAML configuration (merged at runtime)
│   ├── default.yaml            # Base settings
│   ├── models.yaml             # LLM provider overrides
│   ├── ollama_models.yaml      # Supported local models + RAM/disk requirements
│   ├── ranking.yaml            # Ranking weights and top-k
│   └── providers.yaml          # Retrieval provider toggles
│
├── src/                        # Application source
│   ├── __main__.py             # CLI entry point (`python -m src`)
│   ├── config/                 # AppSettings, model auto-selection
│   ├── core/                   # Pipeline engine, stage recovery, metrics
│   ├── retrieval/              # API clients, providers, deduplication
│   │   ├── providers/          # OpenAlex, Semantic Scholar, arXiv, …
│   │   ├── orchestrator.py     # Pipeline facade
│   │   └── models.py           # RetrievedPaper, ResearchReport, …
│   ├── research/               # Query expansion, ranking, clustering
│   ├── analysis/               # Synthesis, gap analysis
│   ├── embeddings/             # sentence-transformers + disk cache
│   ├── models/                 # LLM providers
│   │   ├── ollama.py           # Local Ollama (default)
│   │   ├── openai.py           # OpenAI / compatible endpoints
│   │   ├── anthropic.py        # Anthropic Claude
│   │   └── factory.py          # AgentFactory + provider registry
│   ├── reporting/              # Markdown, HTML, JSON renderers
│   ├── export/                 # BibTeX, APA, MLA, Chicago
│   ├── memory/                 # SQLite session store
│   └── utils/                  # Logging, retry, response handling
│
├── setups/                     # Install & health-check scripts
│   ├── manager.py              # Full setup orchestrator
│   ├── ollama.py               # Ollama install + model pull
│   └── health_check.py         # Validate deps, Ollama, model
│
├── tests/                      # Unit and integration tests
├── reports/                    # Generated reports (gitignored)
├── data/                       # Embeddings cache, SQLite DB (gitignored)
├── logs/                       # Structured logs (gitignored)
├── .env                        # Local secrets (gitignored; see .env.example)
├── Pipfile / Pipfile.lock
└── README.md

Architecture

End-to-end pipeline

┌─────────────────────────────────────────────────────────────────────────────┐
│                              User Query / CLI                               │
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                         QUERY UNDERSTANDING & EXPANSION                     │
│   Parse intent · generate search variants · optional LLM query expansion    │
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                         PARALLEL PAPER RETRIEVAL                            │
│  ┌────────────┐  ┌──────────────────┐  ┌─────────┐  ┌──────────┐            │
│  │  OpenAlex  │  │ Semantic Scholar │  │  arXiv  │  │ CrossRef │  …         │
│  └────────────┘  └──────────────────┘  └─────────┘  └──────────┘            │
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│              EMBEDDING-BACKED PROCESSING  (bge-small-en-v1.5)               │
│   Deduplication  →  Ranking  →  Relevance Scoring  →  Clustering            │
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                              ANALYSIS LAYER                                 │
│   Synthesis (heuristic or LLM)  →  Gap Analysis  →  Citation Export         │
└───────────────────────────────────┬─────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                            REPORT GENERATION                                │
│         Markdown  ·  JSON  ·  HTML  ·  PDF-ready HTML  ·  BibTeX/APA/…      │
└─────────────────────────────────────────────────────────────────────────────┘
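
To make the embedding-backed deduplication stage more concrete, here is a minimal sketch of similarity-based dedup. It is not the project's implementation: the real pipeline embeds papers with bge-small-en-v1.5 via sentence-transformers, whereas this example uses hand-written toy vectors, and the greedy strategy, the 0.95 threshold, and the function names are all assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dedupe(papers: list[tuple[str, list[float]]], threshold: float = 0.95):
    """Greedy dedup: keep a paper only if it is not too similar to any kept one."""
    kept: list[tuple[str, list[float]]] = []
    for title, vector in papers:
        if all(cosine(vector, kept_vector) < threshold for _, kept_vector in kept):
            kept.append((title, vector))
    return kept


# Toy 2-dimensional "embeddings"; real vectors would come from the embedding model.
papers = [
    ("Paper A", [1.0, 0.0]),
    ("Paper A (preprint)", [0.99, 0.05]),
    ("Paper B", [0.0, 1.0]),
]
unique_titles = [title for title, _ in dedupe(papers)]
print(unique_titles)
```

The greedy pass compares each paper against everything already kept, which is quadratic in the worst case but simple to reason about for a few hundred retrieved papers.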

LLM provider layer

The analysis stages call one backend selected via RA_LLM__PROVIDER:

                    ┌──────────────────────────────────────┐
                    │         src/models/ (factory)        │
                    │   AgentFactory · pydantic-ai agents  │
                    └───────────────────┬──────────────────┘
                                        │
              ┌─────────────────────────┼─────────────────────────┐
              │                         │                         │
              ▼                         ▼                         ▼
   ┌────────────────────┐   ┌────────────────────┐   ┌────────────────────┐
   │      Ollama        │   │      OpenAI        │   │     Anthropic      │
   │  (local default)   │   │  gpt-4o-mini, …    │   │  claude-3-5-…      │
   │                    │   │                    │   │                    │
   │ RA_LLM__MODEL=auto │   │ OPENAI_API_KEY     │   │ ANTHROPIC_API_KEY  │
   │ ollama_models.yaml │   │ RA_LLM__API_KEY    │   │ RA_LLM__API_KEY    │
   └────────────────────┘   └────────────────────┘   └────────────────────┘

First-run setup (Ollama)

When RA_LLM__PROVIDER=ollama, startup runs this automatically if anything is missing:

pipenv run python -m src
        │
        ▼
┌───────────────────┐     no      ┌────────────────────────────┐
│ Ollama installed? │────────────►│ Install Ollama (setups/)   │
└─────────┬─────────┘             └─────────────┬──────────────┘
          │ yes                                 │
          ▼                                     ▼
┌───────────────────┐     no      ┌────────────────────────────┐
│  Ollama running?  │────────────►│ Start ollama serve         │
└─────────┬─────────┘             └─────────────┬──────────────┘
          │ yes                                 │
          ▼                                     ▼
┌───────────────────┐     no      ┌────────────────────────────┐
│  Model installed? │────────────►│ ollama pull <resolved>     │
│  (from .env/auto) │             │ e.g. llama3.1:8b / 3b      │
└─────────┬─────────┘             └─────────────┬──────────────┘
          │ yes                                 │
          └─────────────────┬───────────────────┘
                            ▼
                   Run research pipeline

Cloud providers (openai, anthropic) skip Ollama setup and use API keys directly.
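
The first-run decisions can be read as a small planning function. The sketch below only models the decision order; it does not install or start anything, and it is not the code in setups/. plan_first_run and the step strings are hypothetical, and treating a fresh install as "not running, model missing" is an assumption.

```python
def plan_first_run(
    provider: str,
    model: str,
    *,
    ollama_installed: bool,
    ollama_running: bool,
    model_installed: bool,
) -> list[str]:
    """Return the ordered steps that startup would need to take."""
    if provider != "ollama":
        # Cloud providers skip Ollama setup entirely.
        return ["run pipeline"]
    steps: list[str] = []
    if not ollama_installed:
        steps.append("install ollama")
        # Assumption: a fresh install has no running server and no models.
        ollama_running = False
        model_installed = False
    if not ollama_running:
        steps.append("ollama serve")
    if not model_installed:
        steps.append(f"ollama pull {model}")
    steps.append("run pipeline")
    return steps


print(plan_first_run(
    "ollama", "llama3.2:3b",
    ollama_installed=True, ollama_running=False, model_installed=False,
))
```

Keeping the plan separate from its execution makes the flow easy to test without touching the local machine.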

Detailed pipeline flow (Mermaid; renders on GitHub)

flowchart TD
    Q[User Query] --> QU[Query Understanding]
    QU --> QE[Query Expansion]
    QE --> R[Parallel Retrieval]
    R --> OA[OpenAlex]
    R --> SS[Semantic Scholar]
    R --> AX[arXiv / CrossRef / …]
    OA --> DEDUP[Deduplication]
    SS --> DEDUP
    AX --> DEDUP
    DEDUP --> RANK[Ranking]
    RANK --> REL[Relevance Scoring]
    REL --> CLU[Clustering]
    CLU --> SYN[Synthesis]
    SYN --> GAP[Gap Analysis]
    GAP --> CIT[Citation Export]
    CIT --> REP[Report Generation]
    REP --> MD[Markdown]
    REP --> JSON[JSON]
    REP --> HTML[HTML / PDF-ready]

    subgraph LLM["LLM backend (RA_LLM__PROVIDER)"]
        OLL[Ollama]
        OAI[OpenAI]
        ANT[Anthropic]
    end

    QE -.-> LLM
    SYN -.-> LLM
    GAP -.-> LLM
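
The fan-out from Parallel Retrieval to the individual providers can be sketched with asyncio.gather. This is an illustration only: fake_provider stands in for the real aiohttp-based clients in src/retrieval/providers/, and skipping a failed provider instead of aborting the whole query is an assumption about the desired behavior.

```python
import asyncio


async def fake_provider(name: str, query: str, fail: bool = False) -> list[str]:
    """Stand-in for a scholarly API client; returns one fake result."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return [f"{name}: {query}"]


async def retrieve_all(query: str) -> list[str]:
    """Query all providers concurrently and merge whatever succeeds."""
    tasks = [
        fake_provider("openalex", query),
        fake_provider("semantic_scholar", query),
        fake_provider("arxiv", query, fail=True),  # simulate an outage
    ]
    # return_exceptions=True keeps one failure from cancelling the others.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    papers: list[str] = []
    for result in results:
        if isinstance(result, Exception):
            continue  # assumption: skip failed providers
        papers.extend(result)
    return papers


papers = asyncio.run(retrieve_all("attention"))
print(papers)
```

asyncio.gather returns results in the order the tasks were passed, so the merged list is deterministic even though the calls run concurrently.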

Configuration precedence

  Highest ───────────────────────────────────────────────► Lowest

  ┌─────────────┐   ┌─────────────┐   ┌──────────────┐   ┌─────────────┐
  │  RA_* env   │ → │   .env      │ → │ config/*.yaml│ → │   defaults  │
  │  (shell)    │   │   file      │   │  (merged)    │   │  (in code)  │
  └─────────────┘   └─────────────┘   └──────────────┘   └─────────────┘

Output Format

Reports include query summary, ranked papers, synthesis themes, gap analysis, and citations.

Example (markdown excerpt):

# Research Report: transformer attention mechanisms

## Executive Summary
Cross-paper synthesis highlights scaled dot-product attention, multi-head variants,
and efficiency techniques for long-context models.

## Thematic Clusters

### 1. Core Attention Architectures
2023 | NeurIPS
DOI: https://doi.org/10.xxxx/xxxxx

Key findings:
- Multi-head attention improves representational capacity
- FlashAttention reduces memory bandwidth bottlenecks

## Research Gaps
- Limited benchmarks on edge-device deployment
- Under-explored sparse attention for retrieval-augmented pipelines

Export formats:

| --format | Output |
|---|---|
| markdown | Terminal / stdout (default) |
| json | Structured EnhancedResearchReport JSON |
| html | Styled HTML report |
| pdf | Print-ready HTML (open → Print → Save as PDF) |

Use --export bibtex,apa,mla,chicago alongside any format for citation files.

Setup / Ollama

pipenv run python -m setups.health_check
pipenv run python -m setups.manager

  • Model not installed: startup auto-pulls the resolved model, or run ollama pull <model> manually
  • Wrong model: set RA_LLM__MODEL in .env or use --model with setup
  • Ollama not running: run ollama serve or re-run setup

Cloud providers

  • Set RA_LLM__PROVIDER to openai or anthropic and provide the API key
  • Ollama setup is skipped automatically for non-Ollama providers
  • Enable RA_SYNTHESIS__LLM_ENABLED=true for LLM-based synthesis

Missing embeddings / import errors

Always use Pipenv:

pipenv install
pipenv run python -m src

Logs

tail -f logs/combined_*.log

Development

Running locally

pipenv install --dev
pipenv run pytest
pipenv shell
python -m src

Import paths

Within the package (relative imports):

# In src/retrieval/openalex.py
from .models import RetrievedPaper

# In src/analysis/synthesis.py
from ..models import AgentFactory, AgentRole
from ..retrieval.models import RankedPaper

From external scripts (absolute imports):

from src.retrieval.orchestrator import run_research_helper
from src.models import AgentFactory, create_llm_provider
from setups import run_setup, print_report

Core dependencies

| Package | Role |
|---|---|
| pydantic-ai | LLM agents with structured outputs |
| aiohttp | Async HTTP for scholarly APIs |
| sentence-transformers | Embeddings for dedup, ranking, clustering |
| pydantic / pydantic-settings | Schemas and configuration |
| hdbscan | Thematic paper clustering |

About

A Local AI Research Agent using Llama 3.2 and the PydanticAI framework
