The Litmus Test · Forensic IP Attribution

Was this work used to train that model?

IPLYR runs a forensic extraction test and returns a binary verdict — EXTRACTION CONFIRMED or NO EXTRACTION DETECTED — with a signed, hash-anchored evidence bundle you can put in front of a judge.

Request a per-work evidence report

Method based on Ahmed, Cooper, Koyejo & Liang, Extracting books from production language models (arXiv:2601.02671). Designed for Daubert / FRE 702 admissibility — not a claim of prior court acceptance.

Reviewed with IP attorneys at AIPLA · Validated against 111-case fair-use dataset · Built on 1,349 CourtListener precedents
Litmus Test · verdict
matter mat_01J9X8K2QV
Extraction confirmed at p < 0.001 (strict)
Near-verbatim recall
0.873
Scale 0.000 to 1.000 · Stanford moderate-risk floor 0.001
Verbatim blocks
14
Longest block
214 w
Jailbreak score
0.04
Chain-of-custody anchors
sha256:9c3a…f201 merkle:b2f0…d41a ots:2026-06-21T14:08Z c2pa:iplyr-fr-23
Daubert/FRE 702 compliant bundle · signer trust: L2 verified

Method: nv-recall block matching (Ahmed, Cooper, Koyejo & Liang — arXiv:2601.02671). A verdict is the test's output at a chosen sensitivity threshold, not a legal conclusion.
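As a rough illustration of what block matching measures, the sketch below aligns a source text against a model output at the word level and reports the share of source words covered by long shared runs, the number of such runs, and the longest one. It is a simplified stand-in built on Python's `difflib`, not the algorithm from arXiv:2601.02671; the function name `block_recall` and the `min_words` cutoff are assumptions made for this example.

```python
from difflib import SequenceMatcher


def block_recall(source: str, output: str, min_words: int = 5):
    """Return (recall, block_count, longest_block_in_words).

    Simplified illustration only: exact word-level matching blocks,
    filtered by a hypothetical minimum length `min_words`.
    """
    src, out = source.split(), output.split()
    if not src:
        return 0.0, 0, 0
    matcher = SequenceMatcher(a=src, b=out, autojunk=False)
    # get_matching_blocks() yields non-overlapping runs shared by both texts;
    # the final entry is a zero-length sentinel, which the filter drops.
    sizes = [m.size for m in matcher.get_matching_blocks() if m.size >= min_words]
    covered = sum(sizes)
    return covered / len(src), len(sizes), max(sizes, default=0)


if __name__ == "__main__":
    work = "the quick brown fox jumps over the lazy dog again and again"
    generated = "he said the quick brown fox jumps over a fence and again"
    print(block_recall(work, generated, min_words=3))
```

Raising `min_words` makes the measure stricter, which is one way to think about the sensitivity threshold the verdict depends on.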

$1.5B
Largest copyright settlement in US history (Bartz v. Anthropic, 2026)
~$0
Per-work value the settlement placed on training-data evidence
0
Active US copyright lawsuits against AI companies (130 worldwide, June 2026)
Aug 2026
EU AI Act Article 53 training-data transparency enforcement begins

Sources: Authors Guild & Susman Godfrey on the Anthropic settlement; ChatGPT Is Eating the World AI Copyright Case Tracker (June 1, 2026); EU Commission AI Act Article 53. IPLYR is not counsel of record in these matters.

Three doors, one platform

IPLYR is the evidence-and-attribution layer for the AI-content economy.

Pick the door that fits the work you're doing today. The forensic record underneath is the same — what changes is the deliverable.

For litigators & litigation finance

Per-work forensic evidence reports

Run the Litmus Test on a specific work and a specific model. Receive a binary verdict and a Daubert-compliant evidence bundle, hash-anchored and signed, sized for motion practice and case-portfolio underwriting.

  • Binary verdict + signed evidence bundle
  • Damages anchor aligned to Bartz settlement floor
  • Case-portfolio underwriting briefs for funders
For AI labs & publishers

Training-data audits & licensing intelligence

Audit a training corpus for copyrighted exposure before regulators or plaintiffs do. Quote a deal against 270+ tracked comparators. The same forensic engine — applied upstream.

  • EU AI Act Article 53 audit & summary draft
  • Training Data Price Index + BATNA brief
  • Continuous monitoring across catalog scale
How the evidence holds up

Tamper-evident. Independently timestamped. Signed.

Competitors produce PDFs. IPLYR produces a cryptographic evidence bundle designed to survive a motion to exclude and hostile cross-examination.

01

Tamper-evident

Every artifact in the report is hashed into an RFC 6962 / 9162 Merkle tree. Any later alteration to a single line of text, a single prompt, or a single response is mathematically detectable.

RFC 6962 · SHA-256
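To make the tamper-evidence claim concrete, here is a minimal sketch of the Merkle tree hash defined in RFC 6962 (section 2.1), using only the standard library. How IPLYR serializes artifacts into leaves is not stated on this page, so the byte strings used as leaves below are placeholders.

```python
import hashlib
from typing import Sequence


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Merkle tree hash per RFC 6962: 0x00 prefixes leaves, 0x01 prefixes nodes."""
    n = len(leaves)
    if n == 0:
        return _sha256(b"")
    if n == 1:
        return _sha256(b"\x00" + leaves[0])
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return _sha256(b"\x01" + merkle_root(leaves[:k]) + merkle_root(leaves[k:]))


if __name__ == "__main__":
    # Placeholder artifacts; a real bundle would hash prompts, responses, scores.
    original = [b"prompt 01", b"response 01", b"score 0.94"]
    altered = [b"prompt 01", b"response 01", b"score 0.95"]
    print(merkle_root(original).hex())
    print(merkle_root(original) != merkle_root(altered))  # any edit changes the root
```

Because every leaf feeds into the root, changing a single byte of any artifact yields a different root, which is what makes a later alteration detectable.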
02

Independently timestamped

The evidence root is anchored to an external OpenTimestamps proof on Bitcoin's blockchain — establishing when the record existed without trusting IPLYR's clock or servers.

OpenTimestamps · BTC anchor
03

Signed & attributable

Each package carries a C2PA signed seal disclosing the signer, the signer's trust level, and the method version. No anonymous, unverifiable claims.

C2PA · signer trust disclosed

Honesty note: We disclose signer trust level and never assert a timestamp or signature we cannot produce. A signed seal is a verifiable record of provenance — not a substitute for chain-of-custody discipline by counsel.

Built on current law

The engine knows what the courts just ruled.

IPLYR's fair-use scoring and jurisdiction model update as rulings land. Sophisticated buyers test for currency; here it is, on the homepage.

Precedent
N.D. Cal., 2026

Bartz v. Anthropic

Pirated copies aren't transformative; ~$3K/work settlement floor across ~500K titles.

Precedent
N.D. Cal., 2025

Kadrey v. Meta

Fair use found where plaintiffs failed to develop an output-infringement record.

Precedent
LG München, 2025

GEMA v. OpenAI

Memorization is permanent infringement under EU law — not transient copying.

Precedent
EWHC (UK), 2025

Getty v. Stability AI

Couldn't prove where training happened, nor that the model itself contained the works.

Precedent
D. Del., 2025

Thomson Reuters v. Ross

No fair use for a market-substitute AI built on a competitor's copyrighted database.

View full precedent coverage
Case-law database

Holdings summarized for marketing — accurate but compressed. This is product capability, not legal advice.

The problem

Plaintiffs are losing winnable cases for lack of one thing: forensic evidence.

Discovery surfaces probabilities. Trials demand proof. Without reproducible methodology and a defensible chain of custody, even strong matters collapse at Daubert.

Expert engagements run $50K–$500K and 6–12 weeks. IPLYR delivers the same evidentiary tier for $2K–$5K in 7 days.

Kadrey v. Meta

2023–

Authors alleged LLaMA was trained on Books3. The fight turned on whether plaintiffs could demonstrate per-work memorization, not just inclusion in a dataset.

Evidentiary gap → per-work memorization proof

Getty v. Stability AI

2023–

Stylistic and watermark-residual claims required cross-modal attribution scoring that withstood adversarial-robustness challenges.

Evidentiary gap → cross-modal attribution scoring

Bartz settlement context

2024

Class settlements have repeatedly collapsed back to bespoke expert work. Standardized, reproducible attribution would have changed the negotiating posture.

Evidentiary gap → reproducible methodology
Product

Four pillars. One forensic chain.

From multi-modal similarity to court-ready packaging — every stage is research-backed and adversarially tested.

02

Memorization & Extraction Forensics

nv-recall, Cooper et al. PDE, Min-K%++, DE-COP, MIA ensemble, distillation/merge detection, LoRA probes, audio MIA, cross-modal MIA.

What's inside
03

Legal Intelligence

Four-factor fair use + Bibas + post-Warhol analyzers. 111-case validated dataset. Jurisdiction-aware US circuit-by-circuit, EU DSM, UK, DE, JP, KR. DMCA + GDPR + EU AI Act Annex III workflows. Settlement pricing engine.

What's inside
  • 111-case fair-use dataset (53 FU / 58 NFU) · data/fair_use/validation_111.json
  • 1,349 CourtListener precedents · data/courtlistener/copyright_1349.parquet
  • 277 judge profiles · data/judges/profiles_277.json
  • Settlement pricing engine · anchored to Bartz $3K/work
04

Evidence Packaging

Daubert-tier evidence classification, SHA-256 chain of custody, blockchain anchoring, court-ready PDF export, deposition-ready exhibits.

What's inside
  • Daubert tier classifier · SaTML 2025 framework — internal
  • SHA-256 chain of custody · evidence/coc.py
  • Bitcoin anchoring · OpenTimestamps integration
  • Expert declaration template · rev 2025.03
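One common way to build a SHA-256 chain of custody is a hash-chained log, where each entry commits to the digest of the entry before it. The sketch below shows that idea under stated assumptions: the event fields, the all-zero genesis value, and the function names are hypothetical, and nothing here describes the contents of `evidence/coc.py`.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty log


def entry_digest(prev_digest: str, event: dict) -> str:
    """Hash the event together with the previous digest (canonical JSON)."""
    payload = json.dumps(
        {"prev": prev_digest, "event": event},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> None:
    prev = log[-1]["digest"] if log else GENESIS
    log.append({"event": event, "digest": entry_digest(prev, event)})


def verify_log(log: list) -> bool:
    """Recompute every digest in order; any edited entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["digest"] != entry_digest(prev, entry["event"]):
            return False
        prev = entry["digest"]
    return True


if __name__ == "__main__":
    log = []
    append_event(log, {"step": "ingest", "actor": "analyst-1"})  # hypothetical fields
    append_event(log, {"step": "probe", "actor": "analyst-1"})
    print(verify_log(log))            # chain intact
    log[0]["event"]["actor"] = "x"    # simulate tampering
    print(verify_log(log))            # chain broken
```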
Process

How a forensic report works.

Five stages, end-to-end, with a signed evidence chain at each step.

  1. STEP 01

    Submit work + suspect model

    Upload your copyrighted content and select the target LLM, image, audio, or video model. We never retain plaintext beyond the engagement window.

    iplyr ▸ ingest
    bash
    $ iplyr submit \
        --work ./novel.epub \
        --target anthropic/claude-3.7 \
        --jurisdiction US-9 \
        --retention 30d
    work hashed sha256:b2f0d41a
    AES-256 sealed envelope_id env_01J
    matter_id mat_01J9X8K2QV
  2. STEP 02

    Probe & extract

    IPLYR runs multi-turn extraction probes, MIA ensembles, and cross-modal alignment against the target model with adversarial-robustness controls.

    iplyr ▸ probe
    bash
    [probe 01] divergence-attack seed=… budget=2048t
    [probe 02] prefix-completion k=64 temp=0.0
    [probe 03] paraphrase-canary n=128
    [probe 04] style-fingerprint (CSD) d=512
    [probe 05] watermark-residual FFT-band/4
    16 probes complete · 14 positive · 0 inconclusive
  3. STEP 03

    Score & classify

    TQ + nv-recall + concept memorization + RAG two-stage analysis produce per-block scores with confidence intervals.

    findings.exhibit-A · per-block evidence · Tier A
    block 0007 · match 0.94 · high
    block 0011 · match 0.88 · high
    block 0014 · match 0.82 · high
    block 0019 · match 0.61 · moderate
    block 0022 · match 0.41 · low
  4. STEP 04

    Evidentiary classification

    Results are mapped to SaTML 2025 evidence tiers with Daubert compliance flags and jurisdiction-specific thresholds.

    Tier A · Daubert-ready: Reproducible, peer-reviewed methods, error rates published.
    Tier B · Pre-filing diligence: Strong indicators; supports complaint and discovery.
    Tier C · Investigative: Signal warrants further extraction probes.
    Tier D · Insufficient: No memorization detected at adversarial threshold.
  5. STEP 05

    Court-admissible package

    Signed PDF with chain of custody, expert-declaration template, exhibit list, and blockchain-anchored hash. Ready for filing.

    package.iplyr
    json
    {
      "package_id": "pkg_01J9X8K2QV",
      "tier": "A",
      "daubert_compliant": true,
      "chain_of_custody": "sha256:9c3a…f201",
      "btc_anchor": { "block": 832914, "txid": "5fa1…" },
      "exhibits": ["A-Findings.pdf", "B-PerBlock.pdf", "C-Methods.pdf"],
      "expert_declaration_template": "decl-rev-2025.03.docx",
      "reproducible_from_hash": true
    }
TDPI Marketplace

The first price index and OTC marketplace for training data.

Discover, RFQ, clear, and settle training-data licenses with verified provenance.

Price Index

Daily reference prices for training-data classes — text, image, audio, video, code — segmented by domain and rights regime.

  • TWAP and VWAP across venues
  • Domain & jurisdiction segmentation
  • Subscription + API access
TDPI · live tape · USD / Mtok streaming
text.en.news · $0.0146 · +3.00%
image.editorial · $0.0085 · -4.27%
audio.spoken · $0.0207 · -3.45%
video.broadcast · $0.0338 · +2.10%
code.permissive · $0.0059 · +2.13%
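For readers unfamiliar with the two reference prices named above, the following sketch computes a volume-weighted and a time-weighted average price from raw observations. The input shapes (price and volume pairs, timestamped quotes) are assumptions for the example; how TDPI actually aggregates across venues is not described on this page.

```python
from typing import List, Tuple


def vwap(trades: List[Tuple[float, float]]) -> float:
    """Volume-weighted average price from (price, volume) pairs."""
    total_volume = sum(volume for _, volume in trades)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(price * volume for price, volume in trades) / total_volume


def twap(quotes: List[Tuple[float, float]], end_time: float) -> float:
    """Time-weighted average price from time-sorted (timestamp, price) quotes.

    Each price is assumed to hold until the next quote, or until end_time.
    """
    if not quotes:
        raise ValueError("at least one quote is required")
    duration = end_time - quotes[0][0]
    if duration <= 0:
        raise ValueError("end_time must be after the first quote")
    boundaries = [t for t, _ in quotes[1:]] + [end_time]
    weighted = sum(
        price * (t_next - t) for (t, price), t_next in zip(quotes, boundaries)
    )
    return weighted / duration


if __name__ == "__main__":
    # Illustrative numbers in USD per million tokens, not taken from the tape.
    print(vwap([(0.010, 100.0), (0.020, 300.0)]))
    print(twap([(0.0, 0.010), (60.0, 0.020)], end_time=80.0))
```

VWAP leans toward prices at which more volume cleared, while TWAP leans toward prices that persisted longer, so the two can diverge on a thinly traded class.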
Modalities

Four modalities. One forensic chain.

The forensic methods change with the medium. The chain of custody does not.

Methods Library

40+ detection methods. Every one cited.

Search, filter and inspect every detector in the IPLYR forensic stack. Each method links to its paper and to a /methods slug page with implementation, test count, and adversarial-robustness notes.

Min-K%++
Daubert
Zhang et al. · ICLR 2025

Calibrated membership-inference attack against pretraining data. Uses per-token log-likelihood vs. neighborhood entropy to achieve state-of-the-art AUROC across LLaMA, Pythia, GPT-NeoX families. Decouples token informativeness from membership signal, producing a clean p-value under permutation null.

text
26 tests
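As a hedged sketch of the scoring rule published in the Min-K%++ paper: each token's log-probability is standardized against the mean and standard deviation of log-probabilities over the model's full next-token distribution, and the final score averages the lowest k% of those per-token values. The input layout and the name `min_k_plus_plus` are assumptions for this example, and the p-value calibration mentioned above is not covered.

```python
import math
from typing import List, Sequence


def min_k_plus_plus(
    token_ids: Sequence[int],
    log_probs: Sequence[Sequence[float]],
    k_percent: float = 20.0,
) -> float:
    """Average of the lowest k% standardized token scores.

    token_ids[t] is the observed token at position t; log_probs[t] is the
    model's full next-token log-probability vector at that position.
    """
    if not token_ids or len(token_ids) != len(log_probs):
        raise ValueError("token_ids and log_probs must be non-empty and aligned")
    scores: List[float] = []
    for tok, lp in zip(token_ids, log_probs):
        probs = [math.exp(x) for x in lp]
        mu = sum(p * x for p, x in zip(probs, lp))            # expected log-prob
        var = sum(p * (x - mu) ** 2 for p, x in zip(probs, lp))
        sigma = math.sqrt(var)
        # Guard against near-degenerate distributions (an assumption of this sketch).
        scores.append((lp[tok] - mu) / sigma if sigma > 1e-12 else 0.0)
    k = max(1, math.ceil(len(scores) * k_percent / 100.0))
    lowest = sorted(scores)[:k]
    return sum(lowest) / k


if __name__ == "__main__":
    dist = [math.log(0.9), math.log(0.1)]  # toy two-token vocabulary
    print(min_k_plus_plus([0, 0, 0], [dist] * 3))  # likely tokens score higher
    print(min_k_plus_plus([1, 1, 1], [dist] * 3))  # unlikely tokens score lower
```

In the paper's framing, higher scores point toward membership in the training data; turning a score into a decision still requires a threshold calibrated on known members and non-members.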
DE-COP
Daubert
Duarte et al. · arXiv 2024

Detection of copyrighted content via paraphrase-distinguishing probes. Forces the model to choose between original passages and high-quality paraphrases; preference for the original is statistically attributable to memorization.

text
35 tests
nv-recall (Stanford Algorithm 1)
Daubert
Stanford NLP · Stanford 2024

Near-verbatim recall measurement. Replaces exact-match with semantic-equivalence scoring under a calibrated neighborhood, recovering memorization signal that survives stylistic rewriting and translation.

text
55 tests
Cooper PDE
Daubert
Cooper et al. · arXiv 2024

Probabilistic discoverable extraction with confidence bounds. Estimates the probability that a target string is recoverable from a model under a budgeted prompt distribution, with formal lower-bound guarantees.

text
126 tests
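The core relationship behind probabilistic discoverable extraction can be sketched in a few lines: if a single sampled continuation reproduces the target with probability p, the chance of seeing it at least once in n independent samples is 1 - (1 - p)^n. The helper names below are hypothetical, and the sketch leaves out how p itself is estimated and any confidence bounds around it.

```python
import math


def extraction_probability(p_single: float, n_samples: int) -> float:
    """Probability that at least one of n independent samples hits the target."""
    if not 0.0 <= p_single <= 1.0 or n_samples < 0:
        raise ValueError("p_single must be in [0, 1] and n_samples non-negative")
    return 1.0 - (1.0 - p_single) ** n_samples


def samples_needed(p_single: float, target_prob: float) -> int:
    """Smallest n such that extraction_probability(p_single, n) >= target_prob."""
    if not 0.0 < p_single <= 1.0 or not 0.0 < target_prob < 1.0:
        raise ValueError("p_single must be in (0, 1], target_prob in (0, 1)")
    if p_single == 1.0:
        return 1
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - p_single))


if __name__ == "__main__":
    # Illustrative inputs: a 1% per-sample hit rate and a 50% overall target.
    print(extraction_probability(0.01, 100))
    print(samples_needed(0.01, 0.5))
```

This is why a sampling budget matters for the result: a target that almost never appears in one sample can still be very likely to appear once the budget is large enough.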
TRAK
Daubert
Park et al. · ICML 2023

Training-data attribution at scale via random-projection of gradients. Returns a per-training-example influence score on a target query.

text · image
36 tests
Distilled TRAK
Daubert
Park et al. (extension) · ICML 2024

Scalable TRAK variant for billion-parameter diffusion models. Uses distilled gradient surrogates and locality-sensitive hashing for tractable attribution across catalog-scale training indices.

image
41 tests
LoRA Probes
IPLYR Internal · Internal 2025

Low-rank adapter probes that fingerprint memorization-prone parameter subspaces. Surface circuits where training-data residue concentrates.

text · image
18 tests
Audio MIA framework
Daubert
IPLYR + Speech-MIA literature · ICASSP-adjacent 2024

Speech and music MIA combining acoustic loss-curvature, mel-spectrogram divergence, and waveform-level NN-distance.

audio
29 tests
CSD style fingerprinting
Daubert
Somepalli et al. · NeurIPS 2023

Contrastive style descriptors capturing artist-specific visual fingerprints invariant to subject matter. Validated for Getty / artist-style cases.

image
22 tests
MelodySim
Music-MIR consortium · ISMIR 2024

Melodic similarity across pitch-class profiles and rhythmic envelopes for music memorization claims.

audio
16 tests
STALL video forensics
Video-Forensics WG · CVPR-W 2024

Spatiotemporal action-level latent localization for short-form video memorization detection.

video
14 tests
FUMA + CKA / REEF
Daubert
Merge-forensics literature · arXiv 2024

Forensic detection of model merges and lineage via centered kernel alignment between candidate parent models and a target.

text · image
33 tests
Known limits — disclosed
  • 8th Circuit case-law coverage is currently 0 cases; flagged as a gap for the next dataset expansion.
  • Audio and video modules (MelodySim, STALL) are Tier 2–3; full Daubert-tier hardening is in progress.
  • Reports are evidentiary inputs, not legal conclusions. Daubert-tier classification reflects the current SaTML 2025 framework; ultimate admissibility is the court's determination.
IP Portfolio

12 patentable innovations. Filing in progress.

Independent novelty analysis identified 12 patentable inventions across attribution, forensics, and evidence packaging. The top three rated 5/5 on novelty.

Multi-Teacher MIA Ensemble

Statistical aggregation of independent membership-inference detectors using Fisher's, Stouffer's, Simes', harmonic-mean, and Cauchy combination — calibrated against dependence structure.

MIA · ensemble · Daubert-tier

Textual Inversion Forensics

Diffusion-model attribution via inverted embedding probes. Recovers a pseudo-token best activating a target style, with significance scoring against a distractor distribution.

image · attribution · diffusion

Daubert Scoring

Automated evidentiary-tier classification mapping detector outputs to the four Daubert factors — testability, peer review, known error rates, and general acceptance.

evidence · Daubert · tiering
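The five combination rules named for the Multi-Teacher MIA Ensemble are standard statistics, and their textbook forms fit in a short standard-library sketch. This shows the uncalibrated formulas only: Fisher's and Stouffer's methods assume independent detectors, the harmonic mean is returned without its usual calibration step, and the dependence calibration described above is not reproduced. All function names are chosen for this example.

```python
import math
from statistics import NormalDist
from typing import Sequence


def fisher(pvals: Sequence[float]) -> float:
    """Fisher: -2*sum(ln p) is chi-square with 2k degrees of freedom."""
    half = -sum(math.log(p) for p in pvals)
    # Closed-form chi-square survival function for even degrees of freedom.
    term, total = 1.0, 1.0
    for i in range(1, len(pvals)):
        term *= half / i
        total += term
    return math.exp(-half) * total


def stouffer(pvals: Sequence[float]) -> float:
    """Stouffer: sum of z-scores divided by sqrt(k), one-sided."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in pvals) / math.sqrt(len(pvals))
    return 1.0 - nd.cdf(z)


def simes(pvals: Sequence[float]) -> float:
    """Simes: minimum over ranks of n * p_(i) / i."""
    n = len(pvals)
    return min(1.0, min(n * p / i for i, p in enumerate(sorted(pvals), start=1)))


def harmonic_mean_p(pvals: Sequence[float]) -> float:
    """Raw (uncalibrated) harmonic mean of the p-values."""
    return len(pvals) / sum(1.0 / p for p in pvals)


def cauchy(pvals: Sequence[float]) -> float:
    """Cauchy combination test with equal weights."""
    t = sum(math.tan((0.5 - p) * math.pi) for p in pvals) / len(pvals)
    return 0.5 - math.atan(t) / math.pi


if __name__ == "__main__":
    detector_pvals = [0.04, 0.01, 0.20]  # illustrative values, not real results
    for rule in (fisher, stouffer, simes, harmonic_mean_p, cauchy):
        print(rule.__name__, rule(detector_pvals))
```

Inputs must lie strictly between 0 and 1. The rules differ mainly in how much weight they give to a single very small p-value versus broad agreement across detectors.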
Portfolio appraised at $975K–$1.475M. Provisional applications in preparation.
Patent inquiries →
Who we serve

Built for the people who actually litigate.

Walk into deposition with a sealed evidence package, not a hunch.

Pre-filing diligence, expert-declaration templates, cross-jurisdiction memos, and a calibrated settlement-pricing model.

  • Pre-filing diligence in 7 days
  • Expert-declaration templates
  • Cross-jurisdiction memos
  • Settlement pricing model
Read the IP Litigators brief
IP Litigators · sample workflow
01 Submit complaint and target model
02 Run Tier-A extraction probes
03 Generate exhibits A–C
04 Hand expert declaration to ECF
Landscape

How IPLYR compares.

Independent landscape analysis of the AI training-data attribution space.

Capability | IPLYR | ProRata | Patronus | Spawning | Story
Funding | Solo / pre-seed | $30M | $17M | $6M | $54M
Daubert-tier evidence packaging
Memorization forensics | 40+ methods | - | Hallucination focus | Opt-out registry | -
Jurisdictions modeled | 9 + 10 US states | US | US | US | US
Pricing | $2K–$5K / report | Enterprise | Enterprise | Free opt-out | Token-based

Forensic evidence is a category none of them serve. Plaintiffs in Kadrey, Getty, and Bartz-style matters are losing for lack of exactly this.

Methods reviewed with IP attorneys at AIPLA · Validated against the 111-case fair-use dataset (53 FU / 58 NFU) · Built on 1,349 CourtListener copyright precedents (71M opinion clusters scanned) · 277 judge profiles · 0 landmark cases indexed · 0 distillation-moat tests passing · 0 TDPI tests across 16 modules · 193/193 smart contracts compile (OZ v5) · 9 jurisdictions + 10 US state statutes modeled · 14 LLM provider cost models.

AmLaw 100 firms
Major-label publishers
Frontier AI labs (defense)
Photo & news agencies
Author estates
EU regulators
Pricing

Built for litigation. Priced for everyone.

Forensic-grade evidence at a fraction of an expert engagement.

Forensic Report
$2K – $5K
per work / per model

Single-engagement evidence package, Daubert-classified, 7-day turnaround.

  • Tier-A evidentiary package
  • Per-block findings
  • SHA-256 chain of custody
  • 7-day turnaround
Order a report
Most chosen
Litigation Support
from $25K
per matter

Multi-work, multi-model, expert-declaration drafting, deposition support, settlement-pricing memo.

  • Up to 50 works · 6 models
  • Expert declaration drafting
  • Deposition support
  • Settlement pricing memo
Talk to sales
Continuous Monitoring
from $5K
per month

Catalog-scale monitoring across 50+ models, weekly reports, alert on new memorization events.

  • 50+ models monitored
  • Weekly forensic reports
  • Real-time alerting
  • API access
Start monitoring
Enterprise / API
Custom
white-label · on-prem · SLA

White-label, on-prem, SSO, SOC 2, audit logs, contractual SLAs.

  • SSO + RBAC
  • SOC 2 Type II
  • On-prem deployment
  • Custom SLAs
Contact us
ROI

What you'd actually save.

Drag the sliders to model a matter. Baseline: $2K per work per model. Expert engagements anchor at $50K–$500K (American Lawyer expert rate surveys, 2024).

Works: 50 (slider range 1 to 10,000)
Models: 3 (slider range 1 to 20)
Jurisdictions: 1 (slider range 1 to 9)
Anchored to settlement_pricing_engine · Bartz settlement context: $3,000 / work · IPLYR per-work cost: $6,000
Estimated engagement cost
IPLYR
$300,000
Traditional expert
$350,000
Savings: 14% · $50,000
Excludes deposition support (+15–20%). 10×–250× cost reduction range across realistic matter sizes.
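The arithmetic behind the figures above can be reproduced with a short sketch. It assumes the IPLYR estimate is simply works × models × the $2K baseline and takes the traditional-expert figure as an input, since the page does not say how that figure is derived. The jurisdiction slider and the deposition-support uplift are not modeled, and the function name is hypothetical.

```python
def engagement_estimate(
    works: int,
    models: int,
    expert_cost: float,
    per_work_per_model: float = 2_000.0,  # baseline stated on the page
) -> dict:
    """Reproduce the calculator's headline numbers under the stated assumptions."""
    if works <= 0 or models <= 0 or expert_cost <= 0:
        raise ValueError("works, models and expert_cost must be positive")
    per_work_cost = models * per_work_per_model
    iplyr_cost = works * per_work_cost
    savings = expert_cost - iplyr_cost
    return {
        "per_work_cost": per_work_cost,
        "iplyr_cost": iplyr_cost,
        "savings": savings,
        "savings_pct": round(100.0 * savings / expert_cost),
    }


if __name__ == "__main__":
    # The example shown above: 50 works, 3 models, $350,000 expert estimate.
    print(engagement_estimate(works=50, models=3, expert_cost=350_000))
```

With those inputs the sketch yields a $6,000 per-work cost, a $300,000 IPLYR estimate, and savings of $50,000, or about 14%, matching the displayed values.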
Research

Open methods. Defensible math.

Every detector is published. Every score is reproducible from the chain-of-custody hash.

Playground

Try every detector. In your browser.

Pick an endpoint, inspect the request, watch a realistic response stream in. Outputs below are canned-but-realistic — actual calls require an API key.

request body
{
  "model": "anthropic/claude-3.7",
  "work_blocks": ["sha256:b2f0…d41a", "sha256:9c3a…f201"],
  "k_percent": 20
}
response · 200 OK
complete
json

40+ attribution & forensic endpoints · 8 TDPI marketplace endpoints · 6 memorization-risk endpoints. Full OpenAPI 3.1 spec at /docs/api.

Jurisdictions

One platform. Every regime that matters.

Circuit-by-circuit US analysis plus the major international copyright and AI regimes. Hover the map to drill in.

US · EU · UK · DE · FR · JP · KR
United States
Federal + 12 circuits + 10 state statutes (CA, TN, CO, UT, IL, NY, MA, TX, FL, OR)
European Union
DSM Article 4 · AI Act Annex III · EDPB
United Kingdom
CDPA · proposed TDM exception
Germany
GEMA dual-infringement framework
France
CPI · droit d'auteur
Japan
Article 30-4
South Korea
Copyright Act amendments
FAQ

Questions a litigator would ask.

IPLYR produces evidence packaged to the Daubert factors — testability, peer review, known error rates, and general acceptance. Whether any specific evidence is admitted is the court's determination. We provide the foundation; counsel argues admissibility.
Founder, IPLYR
Open to expert-witness collaborations

"I built IPLYR because I sat through too many AI copyright matters where plaintiffs had a real case and no forensic vocabulary to prove it.

The methods exist in the literature — Min-K%++, nv-recall, Cooper PDE, TRAK, MIA ensembles — but they live in conference papers, not in litigation workflows. IPLYR translates them into evidentiary-grade packages with chain of custody, Daubert classification, and jurisdiction-aware analysis.

Methods reviewed with IP attorneys at AIPLA. Open to expert-witness collaborations and academic partnerships."

Eight revenue lanes

Not here for a forensic report? Pick a lane.

IPLYR is a platform. Each lane has its own audience, narrative, pricing, and CTA.

If you're litigating against an AI lab, you can't afford to walk in without this.

SOC 2 Type II · GDPR · EU AI Act Annex III workflows

IPLYR is a forensic evidence platform. We do not provide legal advice or represent parties in litigation.