NanoDAPFAM
Overview
NanoDAPFAM is the compact Nano set for DAPFAM, a domain-aware patent-family retrieval benchmark. It evaluates citation-linked prior-art retrieval at the patent-family level. The group contains eighteen variants formed by three domain conditions (All, In, and Out), two query representations (title-abstract or title-abstract-claims), and three target representations (title-abstract, title-abstract-claims, or full text).
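The eighteen variants are the product of the three factors listed above. As a minimal sketch, the task names can be enumerated as follows; the name tokens (`TitlAbs`, `TitlAbsClm`, `FullText`) are taken from the task tables in this document, while the helper `variant_names` is hypothetical and not part of any released tooling.

```python
from itertools import product


def variant_names(
    domains=("All", "In", "Out"),
    query_fields=("TitlAbs", "TitlAbsClm"),
    target_fields=("TitlAbs", "TitlAbsClm", "FullText"),
):
    """Return one task name per (domain, query fields, target fields) combination."""
    return [
        f"NanoDAPFAM{domain}{query}To{target}"
        for domain, query, target in product(domains, query_fields, target_fields)
    ]


names = variant_names()
print(len(names))  # 3 domains x 2 query forms x 3 target forms
for name in names:
    print(name)
```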
The group is useful because it separates ordinary lexical patent similarity from cross-domain prior-art retrieval. Same-domain and all-domain variants give retrievers many technical anchors: components, materials, methods, and claim phrases. Out-domain variants remove shared IPC3 technical classes, so the model must retrieve cited families related by transferable mechanisms or problem-solution patterns rather than by the same surface vocabulary. The BM25, dense, and reranking_hybrid profiles each reveal a different part of that domain gap.
What This Group Measures
The source paper, "DAPFAM: A Domain-Aware Family-level Dataset to benchmark cross domain patent retrieval", introduces a patent-family retrieval benchmark with citation-based relevance judgments and explicit domain partitions. The source benchmark aggregates patents at the family level to reduce duplicate international publications and uses IPC3 overlap to distinguish same-domain from cross-domain retrieval.
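The IPC3 overlap test can be pictured with a short sketch. It assumes that "IPC3" means the first three characters of an IPC symbol (section plus class, for example `H04`) and that a pair counts as same-domain when the two families share at least one such code. Both points are assumptions for illustration, and the function names are hypothetical; the paper is the authority on the exact rule.

```python
def ipc3(symbol: str) -> str:
    """Truncate an IPC symbol such as 'H04L 9/32' to its first three characters."""
    return symbol.strip().upper()[:3]


def domain_relation(query_ipc: list[str], target_ipc: list[str]) -> str:
    """Label a query/target family pair 'In' if any IPC3 code is shared, else 'Out'.

    Assumption: a single shared IPC3 code is enough to count as same-domain.
    """
    query_codes = {ipc3(symbol) for symbol in query_ipc}
    target_codes = {ipc3(symbol) for symbol in target_ipc}
    return "In" if query_codes & target_codes else "Out"


print(domain_relation(["H04L 9/32"], ["H04W 12/06", "G06F 21/00"]))
print(domain_relation(["H04L 9/32"], ["A61B 5/00"]))
```

Under these assumptions the first pair is labeled `In` (both families carry an `H04` code) and the second `Out`.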
NanoDAPFAM preserves the DAPFAM design in compact 200-query splits. Each split uses the same high-level prior-art retrieval task but changes the domain condition and patent text fields. This makes the group a controlled probe of three factors: whether positives are same-domain or cross-domain, whether the query includes claims, and whether the target is a compact summary or a very long patent text.
Task Families
- All-domain prior-art retrieval: All variants include both same-IPC3 and cross-IPC3 positives, measuring overall citation-linked patent-family retrieval.
- In-domain prior-art retrieval: In variants keep same-domain positives, where shared technical vocabulary is often available.
- Out-domain prior-art retrieval: Out variants keep cross-domain positives, where relevance depends more on transferred mechanisms, materials, or problem-solution analogies.
- Representation comparisons: each domain condition is evaluated with title-abstract and title-abstract-claims queries against title-abstract, title-abstract-claims, and full-text targets.
Dataset Shape
NanoDAPFAM contains 18 task pages, 3,600 queries, 180,000 split-local documents, and 49,879 positive qrel rows. Each split has 200 queries and 10,000 candidate documents. All variants are multi-positive, but density changes by domain: All-domain variants average about 20 positives per query, In-domain variants about 15, and Out-domain variants about 6.
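The aggregate figures can be re-derived from the per-split positive counts in the task metadata table of this document. The following is a small arithmetic check, not part of any official tooling; the variable names are illustrative.

```python
# Positive qrel rows per split, copied from the task metadata table in this document.
POSITIVES = {
    "All": [3989, 3981, 3989, 3989, 3982, 3989],
    "In": [3069, 3062, 3069, 3072, 3066, 3072],
    "Out": [1259, 1257, 1259, 1259, 1257, 1259],
}
QUERIES_PER_SPLIT = 200

total_positives = sum(sum(rows) for rows in POSITIVES.values())
total_queries = QUERIES_PER_SPLIT * sum(len(rows) for rows in POSITIVES.values())

# Mean positives per query within each domain condition.
for domain, rows in POSITIVES.items():
    per_query = sum(rows) / (QUERIES_PER_SPLIT * len(rows))
    print(f"{domain}: {per_query:.2f} positives per query")

print(f"total positives: {total_positives}")
print(f"overall: {total_positives / total_queries:.2f} positives per query")
```

The totals reproduce the 49,879 positive qrel rows and the 13.86 positives per query reported in the metadata summary.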
Text length is a core variable. Title-abstract queries average under 800 characters, while title-abstract-claims queries range from about 8,300 to 9,300 characters. Target documents range from short title-abstract records around 778 characters to full-text patent-family documents around 69,000 to 72,000 characters. The group should therefore be read as a matrix of domain difficulty and representation length, not as eighteen independent tasks.
Retrieval Behavior
BM25 Profile
BM25 is much stronger on All and In variants (nDCG@10 between about 0.29 and 0.36) than on Out variants (about 0.04 to 0.07). In the same-domain setting, shared IPC3 areas provide repeated technical terms, components, material names, and claim language. In Out-domain retrieval, those anchors are weaker or absent, so BM25 must rely on partial mechanism overlap.
Representation also changes sparse behavior. Claim-bearing targets often expose more components and operations than title-abstract targets. Full text gives even more vocabulary but adds large amounts of boilerplate and unrelated legal or descriptive context. A higher BM25 score on long targets does not always mean better semantic retrieval; it may only mean that a longer document offers more chances for incidental term overlap.
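To make the length effect concrete, here is a minimal Okapi BM25 sketch in plain Python. It is an illustration under stated assumptions: whitespace tokenization, the common defaults `k1=1.5` and `b=0.75`, and three toy documents invented for this example. The tokenizer and parameters behind the reported BM25 scores are not specified in this document.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score tokenized documents against a tokenized query with Okapi BM25.

    Assumes `docs` is a non-empty list of non-empty token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        # Length normalization grows with document length relative to the average.
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in set(query):
            tf = term_freq[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Toy documents (hypothetical): a short abstract, a long text, an unrelated record.
docs = [
    "compact heat exchanger with coated tubes".split(),
    ("pump housing " + "wherein said embodiment further comprises " * 5
     + "a heat sink fin and a coating layer").split(),
    "wireless handover signalling method".split(),
]
query = "heat exchanger fin coating".split()
scores = bm25_scores(query, docs)
for doc, score in zip(docs, scores):
    print(len(doc), round(score, 3))
```

The long second document matches three query terms through incidental wording while the short one matches two, so the final ranking depends on how strongly the `b` parameter penalizes length. The unrelated record shares no query term and scores zero.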
Dense Profile
Dense retrieval is the best profile for 13 of the 18 NanoDAPFAM variants in the current metadata, including all six Out-domain variants. Its gain over BM25 is clearest in relative terms on Out-domain tasks, where the positive may share an abstract mechanism with the query even when patent-field vocabulary differs. Dense retrieval also benefits title-abstract targets, where there is less text for exact matching.
Dense performance should still be interpreted carefully. Patent retrieval depends on exact technical distinctions, claim scope, and family-level invention identity. A semantically related patent can be wrong if it solves a different problem or lacks the cited technical relation.
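For readers less familiar with the dense profile, the ranking step reduces to nearest-neighbor search over embedding vectors. The sketch below uses cosine similarity on toy two-dimensional vectors; the vectors, the ids, and the function names are hypothetical, and the embedding model behind the reported dense scores is not specified in this document.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dense_top_k(query_vec, doc_vecs, k=10):
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (hypothetical values, not produced by any real model).
doc_vecs = {"fam_a": [1.0, 0.1], "fam_b": [0.0, 1.0], "fam_c": [0.5, 0.5]}
top = dense_top_k([1.0, 0.0], doc_vecs, k=2)
print(top)
```

A high cosine score only says the texts are close in embedding space. As noted above, it does not establish that the retrieved family carries the cited technical relation.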
Reranking Hybrid Profile
reranking_hybrid falls between BM25 and dense in nDCG@10 for 13 of the 18 variants; in the other five, all with title-abstract queries, it edges out dense by less than 0.01. It is also useful for candidate generation. Patent retrieval needs both exact technical anchors and broader mechanism matching. In several title-abstract and full-text variants, the hybrid pool can preserve candidates found by either signal.
For reranker experiments, Out-domain variants are the most important stress test. If the first-stage candidate pool misses cross-domain positives, reranking cannot recover the analogy or mechanism relation.
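The point about first-stage coverage can be sketched as a pool union plus a recall check. This is an illustration only: how the reranking_hybrid profile actually builds its pool is not described in this document, and the function names and family ids below are hypothetical.

```python
def hybrid_pool(bm25_ranked, dense_ranked, k=100):
    """Union of the top-k ids from each ranker, de-duplicated, sparse results first."""
    return list(dict.fromkeys(bm25_ranked[:k] + dense_ranked[:k]))


def pool_recall(pool, positives):
    """Fraction of positive families that appear anywhere in the candidate pool."""
    if not positives:
        return 0.0
    return len(set(positives).intersection(pool)) / len(positives)


# Toy rankings (hypothetical family ids).
bm25_ranked = ["f1", "f2", "f3"]
dense_ranked = ["f3", "f9", "f7"]
positives = {"f2", "f9", "f8"}

pool = hybrid_pool(bm25_ranked, dense_ranked, k=3)
print(pool)
print(pool_recall(bm25_ranked, positives))
print(pool_recall(dense_ranked, positives))
print(pool_recall(pool, positives))
```

In this toy case each ranker alone finds one of the three positives, while the union finds two. The third positive, `f8`, is in neither list, so no reranker operating on this pool can place it.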
Task Summary
| Task | Domain | Query fields | Target fields | Positives/query | BM25 nDCG@10 | Dense nDCG@10 | Reranking hybrid nDCG@10 | Best profile |
| NanoDAPFAMAllTitlAbsClmToFullText | All | title+abstract+claims | full text | 19.95 | 0.3365 | 0.4352 | 0.4215 | Dense |
| NanoDAPFAMAllTitlAbsClmToTitlAbs | All | title+abstract+claims | title+abstract | 19.91 | 0.2864 | 0.3997 | 0.3767 | Dense |
| NanoDAPFAMAllTitlAbsClmToTitlAbsClm | All | title+abstract+claims | title+abstract+claims | 19.95 | 0.3360 | 0.4156 | 0.3989 | Dense |
| NanoDAPFAMAllTitlAbsToFullText | All | title+abstract | full text | 19.95 | 0.3489 | 0.4149 | 0.4175 | Reranking hybrid |
| NanoDAPFAMAllTitlAbsToTitlAbs | All | title+abstract | title+abstract | 19.91 | 0.3281 | 0.3786 | 0.3790 | Reranking hybrid |
| NanoDAPFAMAllTitlAbsToTitlAbsClm | All | title+abstract | title+abstract+claims | 19.95 | 0.3510 | 0.4056 | 0.4088 | Reranking hybrid |
| NanoDAPFAMInTitlAbsClmToFullText | In | title+abstract+claims | full text | 15.35 | 0.3505 | 0.4484 | 0.4375 | Dense |
| NanoDAPFAMInTitlAbsClmToTitlAbs | In | title+abstract+claims | title+abstract | 15.31 | 0.2970 | 0.4135 | 0.3805 | Dense |
| NanoDAPFAMInTitlAbsClmToTitlAbsClm | In | title+abstract+claims | title+abstract+claims | 15.35 | 0.3473 | 0.4325 | 0.4157 | Dense |
| NanoDAPFAMInTitlAbsToFullText | In | title+abstract | full text | 15.36 | 0.3490 | 0.4255 | 0.4228 | Dense |
| NanoDAPFAMInTitlAbsToTitlAbs | In | title+abstract | title+abstract | 15.33 | 0.3386 | 0.3923 | 0.3942 | Reranking hybrid |
| NanoDAPFAMInTitlAbsToTitlAbsClm | In | title+abstract | title+abstract+claims | 15.36 | 0.3593 | 0.4125 | 0.4220 | Reranking hybrid |
| NanoDAPFAMOutTitlAbsClmToFullText | Out | title+abstract+claims | full text | 6.29 | 0.0461 | 0.1010 | 0.0869 | Dense |
| NanoDAPFAMOutTitlAbsClmToTitlAbs | Out | title+abstract+claims | title+abstract | 6.29 | 0.0439 | 0.0872 | 0.0714 | Dense |
| NanoDAPFAMOutTitlAbsClmToTitlAbsClm | Out | title+abstract+claims | title+abstract+claims | 6.29 | 0.0640 | 0.0952 | 0.0811 | Dense |
| NanoDAPFAMOutTitlAbsToFullText | Out | title+abstract | full text | 6.29 | 0.0638 | 0.0952 | 0.0858 | Dense |
| NanoDAPFAMOutTitlAbsToTitlAbs | Out | title+abstract | title+abstract | 6.29 | 0.0583 | 0.0872 | 0.0762 | Dense |
| NanoDAPFAMOutTitlAbsToTitlAbsClm | Out | title+abstract | title+abstract+claims | 6.29 | 0.0699 | 0.0909 | 0.0901 | Dense |
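For reference, the nDCG@10 columns above can be read against the following minimal sketch of the metric. It assumes binary relevance (a cited family counts as 1, anything else as 0), the standard log2 rank discount, and a ranked list without duplicate ids. The evaluation tooling that produced the table is not specified here, so treat this as the textbook definition rather than the exact scorer.

```python
import math


def ndcg_at_k(ranked_ids, positives, k=10):
    """nDCG@k with binary gains: 1 for a positive family, 0 otherwise."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in positives
    )
    # The ideal ranking places min(|positives|, k) positives at the top.
    ideal_hits = min(len(positives), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# One positive, retrieved at rank 2 (hypothetical ids).
print(ndcg_at_k(["x", "p", "y"], {"p"}))
```

With about 20 positives per query in the All variants and about 6 in the Out variants, the ideal DCG differs across domains, which is one reason to compare profiles within a domain condition before comparing across conditions.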
Interpretation Notes for Model Researchers
NanoDAPFAM is best interpreted by domain condition first. All and In variants measure patent-family retrieval when the target is usually in or near the same technical area. Out variants measure harder cross-domain prior-art retrieval, where models need analogy and mechanism transfer. A strong model should reduce the Out-domain gap without merely memorizing patent families.
Representation effects should be interpreted second. Claims add legal and component detail; full text adds enormous context. Better performance on full text may reflect useful mechanism evidence, but it may also reflect more opportunities for lexical overlap. Comparing title-abstract targets with claim-bearing and full-text targets helps separate concise semantic matching from long-document term coverage.
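The Out-domain gap discussed above can be summarized directly from the task summary table. The sketch below averages nDCG@10 per domain and profile and reports the In-minus-Out difference; the unweighted mean over six splits is a simplification chosen for illustration, not a metric defined by the benchmark.

```python
from statistics import mean

# (BM25, dense, reranking_hybrid) nDCG@10 per split, copied from the task summary table.
NDCG = {
    "All": [(0.3365, 0.4352, 0.4215), (0.2864, 0.3997, 0.3767), (0.3360, 0.4156, 0.3989),
            (0.3489, 0.4149, 0.4175), (0.3281, 0.3786, 0.3790), (0.3510, 0.4056, 0.4088)],
    "In": [(0.3505, 0.4484, 0.4375), (0.2970, 0.4135, 0.3805), (0.3473, 0.4325, 0.4157),
           (0.3490, 0.4255, 0.4228), (0.3386, 0.3923, 0.3942), (0.3593, 0.4125, 0.4220)],
    "Out": [(0.0461, 0.1010, 0.0869), (0.0439, 0.0872, 0.0714), (0.0640, 0.0952, 0.0811),
            (0.0638, 0.0952, 0.0858), (0.0583, 0.0872, 0.0762), (0.0699, 0.0909, 0.0901)],
}
PROFILES = ("bm25", "dense", "reranking_hybrid")

# Unweighted mean per domain and profile (columns of each domain's rows).
domain_means = {}
for domain, rows in NDCG.items():
    domain_means[domain] = {}
    for name, column in zip(PROFILES, zip(*rows)):
        domain_means[domain][name] = mean(column)

for name in PROFILES:
    in_mean = domain_means["In"][name]
    out_mean = domain_means["Out"][name]
    print(f"{name}: In {in_mean:.3f}, Out {out_mean:.3f}, gap {in_mean - out_mean:.3f}")
```

A model that narrows this gap while holding its In-domain score is improving on the cross-domain part of the task, which is the behavior the Out variants are meant to expose.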
Training and Leakage Notes
Useful training data includes patent-family citation retrieval, prior-art search pairs, patent semantic similarity, cross-IPC citation prediction, patent analogy retrieval, and field-aware training over titles, abstracts, claims, and full descriptions. Hard negatives should include same-IPC patents that share terminology but are not cited, plus cross-domain patents that share surface terms without the relevant mechanism.
Exclude NanoDAPFAM evaluation family IDs, qrels, positive target families, same-family duplicate publications, and near-duplicate patent publications from other jurisdictions. Family-level aggregation is important: using another member of the same patent family can leak the invention.
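A family-level exclusion filter might look like the sketch below. The record layout (`query_family` and `target_family` keys) and the function name are hypothetical; the only assumption carried over from the text is that filtering happens on family identifiers, so that every publication belonging to an evaluation family is removed together.

```python
def filter_training_pairs(pairs, eval_family_ids):
    """Drop any training pair whose query or target family appears in evaluation.

    `pairs` is an iterable of dicts with hypothetical keys 'query_family' and
    'target_family'; `eval_family_ids` covers evaluation queries and targets.
    """
    blocked = set(eval_family_ids)
    return [
        pair
        for pair in pairs
        if pair["query_family"] not in blocked and pair["target_family"] not in blocked
    ]


# Toy data (hypothetical family ids).
training_pairs = [
    {"query_family": "F1", "target_family": "F2"},
    {"query_family": "F3", "target_family": "F4"},
    {"query_family": "F5", "target_family": "F1"},
]
clean = filter_training_pairs(training_pairs, eval_family_ids={"F1"})
print(clean)
```

Filtering on publication numbers instead of family ids would miss members of the same family filed in other jurisdictions, which is the leak described above. Near-duplicate detection across families would need a separate text-similarity pass that this sketch does not cover.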
Source Reference Table
| Source | Year | Type | URL |
| DAPFAM: A Domain-Aware Family-level Dataset to benchmark cross domain patent retrieval | 2025 | paper | https://arxiv.org/abs/2506.22141 |
Metadata Summary
| Field | Value |
| Task pages | 18 |
| Queries | 3,600 |
| Split-local documents | 180,000 |
| Positive qrels | 49,879 |
| Languages | en |
| Categories | natural_language |
| Positives / query avg | 13.86 |
Task Metadata Summary
| Task | Backing dataset | Lang | Category | Queries | Docs | Positives | BM25 nDCG@10 | Dense nDCG@10 | Reranking hybrid nDCG@10 | Best profile |
| NanoDAPFAMAllTitlAbsClmToFullText | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,989 | 0.3365 | 0.4352 | 0.4215 | Dense |
| NanoDAPFAMAllTitlAbsClmToTitlAbs | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,981 | 0.2864 | 0.3997 | 0.3767 | Dense |
| NanoDAPFAMAllTitlAbsClmToTitlAbsClm | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,989 | 0.3360 | 0.4156 | 0.3989 | Dense |
| NanoDAPFAMAllTitlAbsToFullText | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,989 | 0.3489 | 0.4149 | 0.4175 | Reranking hybrid |
| NanoDAPFAMAllTitlAbsToTitlAbs | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,982 | 0.3281 | 0.3786 | 0.3790 | Reranking hybrid |
| NanoDAPFAMAllTitlAbsToTitlAbsClm | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,989 | 0.3510 | 0.4056 | 0.4088 | Reranking hybrid |
| NanoDAPFAMInTitlAbsClmToFullText | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,069 | 0.3505 | 0.4484 | 0.4375 | Dense |
| NanoDAPFAMInTitlAbsClmToTitlAbs | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,062 | 0.2970 | 0.4135 | 0.3805 | Dense |
| NanoDAPFAMInTitlAbsClmToTitlAbsClm | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,069 | 0.3473 | 0.4325 | 0.4157 | Dense |
| NanoDAPFAMInTitlAbsToFullText | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,072 | 0.3490 | 0.4255 | 0.4228 | Dense |
| NanoDAPFAMInTitlAbsToTitlAbs | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,066 | 0.3386 | 0.3923 | 0.3942 | Reranking hybrid |
| NanoDAPFAMInTitlAbsToTitlAbsClm | NanoDAPFAM | en | natural_language | 200 | 10,000 | 3,072 | 0.3593 | 0.4125 | 0.4220 | Reranking hybrid |
| NanoDAPFAMOutTitlAbsClmToFullText | NanoDAPFAM | en | natural_language | 200 | 10,000 | 1,259 | 0.0461 | 0.1010 | 0.0869 | Dense |
| NanoDAPFAMOutTitlAbsClmToTitlAbs | NanoDAPFAM | en | natural_language | 200 | 10,000 | 1,257 | 0.0439 | 0.0872 | 0.0714 | Dense |
| NanoDAPFAMOutTitlAbsClmToTitlAbsClm | NanoDAPFAM | en | natural_language | 200 | 10,000 | 1,259 | 0.0640 | 0.0952 | 0.0811 | Dense |
| NanoDAPFAMOutTitlAbsToFullText | NanoDAPFAM | en | natural_language | 200 | 10,000 | 1,259 | 0.0638 | 0.0952 | 0.0858 | Dense |
| NanoDAPFAMOutTitlAbsToTitlAbs | NanoDAPFAM | en | natural_language | 200 | 10,000 | 1,257 | 0.0583 | 0.0872 | 0.0762 | Dense |
| NanoDAPFAMOutTitlAbsToTitlAbsClm | NanoDAPFAM | en | natural_language | 200 | 10,000 | 1,259 | 0.0699 | 0.0909 | 0.0901 | Dense |