The Litmus Test · Forensic IP Attribution

Was this work used to train that model?

IPLYR issues a written forensic opinion on whether a specific work is present in a specific model. Each engagement returns a binary finding, EXTRACTION CONFIRMED or NO EXTRACTION DETECTED, delivered as a signed, hash-anchored evidence bundle prepared for use before the court.

Request a per-work evidence report

Methodology draws on peer-reviewed extraction and membership-inference research. Prepared under the Daubert / FRE 702 framework. Admissibility in any particular proceeding remains the court's determination. Full technical brief available to counsel and opposing experts under MNDA.

Reviewed with IP counsel · Validated against a 111-opinion fair-use corpus · Referenced against 1,349 CourtListener precedents

Litmus Test · verdict

matter mat_01J9X8K2QV

Extraction confirmed · strict reporting threshold

Recall index

0.873

0.000 · reporting threshold · 1.000

Matched blocks

Longest block

214 w

Robustness

Chain-of-custody anchors

evidence:9c3a…f201 ledger:b2f0…d41a timestamp:2026-06-21T14:08Z seal:iplyr-fr-23

Prepared under Daubert / FRE 702 · signer trust: L2 verified

Illustrative bundle. The finding reflects the test's output at the reporting threshold set for the engagement. Legal conclusions remain with counsel.

$1.5B

Largest copyright settlement in US history (Bartz v. Anthropic, 2026)

~$0

Per-work value the settlement placed on training-data evidence

Active US copyright lawsuits against AI companies (130 worldwide, June 2026)

Aug 2026

EU AI Act Article 53 training-data transparency enforcement begins

Sources: Authors Guild & Susman Godfrey on the Anthropic settlement; ChatGPT Is Eating the World AI Copyright Case Tracker (June 1, 2026); EU Commission AI Act Article 53. IPLYR is not counsel of record in these matters.

Three doors, one platform

IPLYR is the evidence-and-attribution layer for the AI-content economy.

Pick the door that fits the work you're doing today. The forensic record underneath is the same; what changes is the deliverable.

For litigators & litigation finance

Per-work forensic evidence reports

Run the Litmus Test on a specific work and a specific model. Receive a binary verdict and a hash-anchored, signed evidence bundle prepared under the Daubert framework, sized for motion practice and case-portfolio underwriting.

Binary verdict + signed evidence bundle
Damages anchor aligned to Bartz settlement floor
Case-portfolio underwriting briefs for funders

Request a per-work report · See methodology →

For AI labs & publishers

Training-data audits & licensing intelligence

Audit a training corpus for copyrighted exposure before regulators or plaintiffs do. Quote a deal against 270+ tracked comparators. The same forensic engine, applied upstream.

EU AI Act Article 53 audit & summary draft
Training Data Price Index + BATNA brief
Continuous monitoring across catalog scale

Scope an Article 53 audit · Pricing intelligence →

For insurers

Risk quantification for AI IP liability

Pilot partnerships for carriers building dedicated AI-copyright lines. Underwriting methodology shared under NDA.

Pilot partnerships

How the evidence holds up

Tamper-evident. Independently timestamped. Signed.

Each engagement produces a written forensic opinion accompanied by a cryptographic evidence bundle prepared to withstand a motion to exclude and a hostile cross-examination.

Tamper-evident

Every artifact in the report is committed to a cryptographic ledger. Later alteration of any prompt, response, or file is detectable on inspection.

Cryptographic ledger
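
To make the tamper-evidence claim concrete, here is a minimal Python sketch of a hash-chained ledger, assuming SHA-256 and JSON-serialized entries. It is not IPLYR's ledger: the entry layout and the names `build_ledger` and `first_tampered_index` are hypothetical, chosen only to show why a later edit to any prompt, response, or file shows up on re-verification.

```python
import hashlib
import json
from typing import List, Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash one entry together with the hash of the entry before it."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_ledger(artifacts: List[dict]) -> List[dict]:
    """Commit artifacts in order; each entry stores its chained hash."""
    ledger, prev = [], GENESIS
    for payload in artifacts:
        digest = entry_hash(prev, payload)
        ledger.append({"payload": payload, "prev": prev, "hash": digest})
        prev = digest
    return ledger


def first_tampered_index(ledger: List[dict]) -> Optional[int]:
    """Recompute the chain; return the index of the first bad entry, or None."""
    prev = GENESIS
    for i, entry in enumerate(ledger):
        recomputed = entry_hash(prev, entry["payload"])
        if entry["prev"] != prev or recomputed != entry["hash"]:
            return i
        prev = entry["hash"]
    return None


# Hypothetical artifacts, for illustration only.
ledger = build_ledger([
    {"kind": "prompt", "text": "Continue the passage: ..."},
    {"kind": "response", "text": "model output goes here"},
    {"kind": "file", "sha256": hashlib.sha256(b"example work").hexdigest()},
])
print(first_tampered_index(ledger))  # None: the chain verifies

ledger[1]["payload"]["text"] = "edited after the fact"
print(first_tampered_index(ledger))  # 1: the altered response is detected
```

Because each hash covers the previous one, changing an early entry invalidates that entry on recomputation, and anyone holding the final hash can check the whole record.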

Independently timestamped

The record is anchored to an external, third-party timestamp. The date the evidence existed does not depend on IPLYR's servers or clock.

Independent timestamp

Signed and attributable

Each bundle carries a signed provenance seal that discloses the signer, the signer's trust level, and the methodology version applied.

Signed provenance seal
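
As a rough illustration of a seal that discloses signer, trust level, and methodology version, the Python sketch below signs a small seal record and verifies it. Everything here is an assumption: the field names are hypothetical, and HMAC-SHA256 with a shared key is used only as a standard-library stand-in. A real provenance seal would normally use an asymmetric signature so that third parties can verify without holding the signing key.

```python
import hashlib
import hmac
import json


def canonical(seal: dict) -> bytes:
    """Serialize the seal deterministically so signer and verifier agree on bytes."""
    return json.dumps(seal, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_seal(seal: dict, key: bytes) -> str:
    """Stand-in signature: HMAC-SHA256 over the canonical seal bytes."""
    return hmac.new(key, canonical(seal), hashlib.sha256).hexdigest()


def verify_seal(seal: dict, signature: str, key: bytes) -> bool:
    """Constant-time comparison of the expected and presented signatures."""
    return hmac.compare_digest(sign_seal(seal, key), signature)


key = b"demo-key-not-for-production"  # hypothetical key, illustration only
seal = {
    "signer": "examiner@example.org",        # hypothetical signer identity
    "signer_trust": "L2",
    "methodology_version": "example-1.0",    # hypothetical version label
    "bundle_sha256": hashlib.sha256(b"bundle bytes").hexdigest(),
}
signature = sign_seal(seal, key)
print(verify_seal(seal, signature, key))     # True

altered = dict(seal, signer_trust="L3")
print(verify_seal(altered, signature, key))  # False: any field change breaks the seal
```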

A signed seal is a verifiable record of provenance. It is not a substitute for the chain-of-custody discipline maintained by counsel. Specific cryptographic primitives and version numbers are disclosed to counsel and opposing experts under MNDA.

Built on current law

The engine knows what the courts just ruled.

IPLYR's fair-use scoring and jurisdiction model update as rulings land. Sophisticated buyers test for currency; here it is, on the homepage.

Precedent

N.D. Cal., 2026

Bartz v. Anthropic

Pirated copies aren't transformative; ~$3K/work settlement floor across ~500K titles.

Precedent

N.D. Cal., 2025

Kadrey v. Meta

Fair use found where plaintiffs failed to develop an output-infringement record.

Precedent

LG München, 2025

GEMA v. OpenAI

Memorization is permanent infringement under EU law, not transient copying.

Precedent

EWHC (UK), 2025

Getty v. Stability AI

Couldn't prove where training happened, nor that the model itself contained the works.

Precedent

D. Del., 2025

Thomson Reuters v. Ross

No fair use for a market-substitute AI built on a competitor's copyrighted database.

View full precedent coverage

Case-law database

Holdings are summarized for marketing: accurate, but compressed. This is product capability, not legal advice.

The problem

Plaintiffs are losing winnable cases for lack of one thing: forensic evidence.

Discovery surfaces probabilities. Trials demand proof. Without reproducible methodology and a defensible chain of custody, even strong matters collapse at Daubert.

Expert engagements run $50K-$500K and take 6–12 weeks. IPLYR delivers the same evidentiary tier for $2K-$5K in 7 days.

Kadrey v. Meta

2023–

Authors alleged LLaMA was trained on Books3. The fight turned on whether plaintiffs could demonstrate per-work memorization, beyond mere inclusion in a dataset.

Evidentiary gap → per-work memorization proof

Getty v. Stability AI

2023–

Stylistic and watermark-residual claims required cross-modal attribution scoring that withstood adversarial-robustness challenges.

Evidentiary gap → cross-modal attribution scoring

Bartz settlement context

2024

Class settlements have repeatedly collapsed back to bespoke expert work. Standardized, reproducible attribution would have changed the negotiating posture.

Evidentiary gap → reproducible methodology

Product

Four pillars. One forensic chain.

From multi-modal similarity to court-ready packaging, every stage is research-backed and adversarially tested.

Attribution Engine (TQ)

Multi-modal Transformation Quotient (TQ) across text, image, audio, and video. An 11-dimensional score combining statistical, neural, and adversarially robust components, reported as a defensible similarity score with bias-corrected confidence intervals.

What's inside ›

TQ 11-dim vector · Stanford Algorithm 1 reference
Neural TQ overlay · Internal, paired with BCa CIs
CSD style fingerprinting · Somepalli et al., NeurIPS 2023
SSCD + DINOv2 image attribution · ICCV 2023
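
To show what a bias-corrected interval around a similarity score involves, here is a minimal Python sketch of a BCa (bias-corrected and accelerated) bootstrap, applied to the mean of a handful of scores. The function name `bca_interval`, the sample scores, and the choice of the mean as the statistic are illustrative assumptions, not the TQ engine's interface or inputs.

```python
import random
from statistics import NormalDist, mean


def bca_interval(data, stat=mean, n_boot=2000, alpha=0.05, seed=0):
    """BCa bootstrap confidence interval for stat(data). Illustrative only."""
    rng = random.Random(seed)
    nd = NormalDist()
    n = len(data)
    theta = stat(data)

    # Bootstrap distribution of the statistic (resampling with replacement).
    boots = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))

    # Bias correction z0: where the estimate sits in the bootstrap distribution.
    frac_below = sum(b < theta for b in boots) / n_boot
    frac_below = min(max(frac_below, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    z0 = nd.inv_cdf(frac_below)

    # Acceleration a, estimated from leave-one-out (jackknife) values.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jbar = mean(jack)
    num = sum((jbar - j) ** 3 for j in jack)
    den = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def adjusted_quantile(q):
        z = nd.inv_cdf(q)
        return nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))

    def pick(q):
        return boots[min(n_boot - 1, max(0, int(q * n_boot)))]

    return pick(adjusted_quantile(alpha / 2)), pick(adjusted_quantile(1 - alpha / 2))


scores = [0.94, 0.88, 0.82, 0.61, 0.41]  # hypothetical per-block similarity scores
low, high = bca_interval(scores)
print(f"mean={mean(scores):.3f}  95% BCa CI=({low:.3f}, {high:.3f})")
```

Unlike a plain percentile bootstrap, the BCa adjustment shifts the interval endpoints when the bootstrap distribution is skewed or biased, which is common with small samples of bounded scores.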

Memorization & Extraction Forensics

nv-recall, Cooper et al. PDE, Min-K%++, DE-COP, MIA ensemble, distillation/merge detection, LoRA probes, audio MIA, cross-modal MIA.

What's inside ›

Min-K%++ · Zhang et al., ICLR 2025, 26 tests
DE-COP · Duarte et al., arXiv 2402.09910, 35 tests
nv-recall · Stanford Algorithm 1, 55 tests
Cooper PDE · Cooper et al. 2024, 126 tests
Multi-teacher ensemble · SaTML 2025, 87 tests

Legal Intelligence

Four-factor fair use + Bibas + post-Warhol analyzers. 111-case validated dataset. Jurisdiction-aware: US circuit-by-circuit, EU DSM, UK, DE, JP, KR. DMCA + GDPR + EU AI Act Annex III workflows. Settlement pricing engine.

What's inside ›

111-case fair-use dataset (53 FU / 58 NFU) · data/fair_use/validation_111.json
1,349 CourtListener precedents · data/courtlistener/copyright_1349.parquet
277 judge profiles · data/judges/profiles_277.json
Settlement pricing engine · anchored to Bartz $3K/work

Evidence Packaging

Daubert-tier evidence classification, cryptographic chain of custody, independent timestamping, court-ready PDF export, deposition-ready exhibits.

What's inside ›

Daubert tier classifier · SaTML 2025 framework, internal
Chain of custody · evidence/coc
Independent timestamp anchor · third-party integration
Expert declaration template · rev 2026.06

Process

How a forensic report works.

Five stages, end-to-end, with a signed evidence chain at each step.

STEP 01

Submit work + suspect model

Upload your copyrighted content and select the target LLM, image, audio, or video model. We never retain plaintext beyond the engagement window.

iplyr ▸ ingest

bash

$ iplyr submit \
    --work ./novel.epub \
    --target anthropic/claude-3.7 \
    --jurisdiction US-9 \
    --retention 30d
  ✔ work hashed     sha256:b2f0…d41a
  ✔ AES-256 sealed  envelope_id env_01J…
  → matter_id mat_01J9X8K2QV

STEP 02

Probe & extract

IPLYR runs multi-turn extraction probes, MIA ensembles, and cross-modal alignment against the target model with adversarial-robustness controls.

iplyr ▸ probe

bash

[probe 01]  divergence-attack         seed=… budget=2048t
[probe 02]  prefix-completion         k=64  temp=0.0
[probe 03]  paraphrase-canary         n=128
[probe 04]  style-fingerprint (CSD)   d=512
[probe 05]  watermark-residual        FFT-band=π/4
  → 16 probes complete · 14 positive · 0 inconclusive
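
One of the simplest signals a prefix-completion probe can yield is the longest run of consecutive words that a model's output shares with the work. The Python sketch below computes that run with a standard longest-common-substring dynamic program over words. It is a hedged illustration only: `longest_shared_block` is a hypothetical helper, whitespace tokenization is an assumption, and nothing here is claimed to be how the probes above score their results.

```python
def longest_shared_block(reference: str, output: str):
    """Return (length_in_words, text) of the longest contiguous shared word run."""
    ref, out = reference.split(), output.split()
    best_len, best_end = 0, 0
    prev = [0] * (len(out) + 1)
    for i in range(1, len(ref) + 1):
        cur = [0] * (len(out) + 1)
        for j in range(1, len(out) + 1):
            if ref[i - 1] == out[j - 1]:
                # Extend the run that ended at the previous word pair.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return best_len, " ".join(ref[best_end - best_len:best_end])


work = "the quick brown fox jumps over the lazy dog"
model_output = "a quick brown fox leaps over the lazy cat"
print(longest_shared_block(work, model_output))  # (3, 'quick brown fox')
```

Exact-match runs miss paraphrased recall, which is why a measure like this would only ever be one input alongside non-verbatim methods.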

STEP 03

Score & classify

TQ + nv-recall + concept memorization + RAG two-stage analysis produce per-block scores with confidence intervals.

findings.exhibit-A · per-block evidence · Tier A

block 0007 · match 0.94 · high
block 0011 · match 0.88 · high
block 0014 · match 0.82 · high
block 0019 · match 0.61 · moderate
block 0022 · match 0.41 · low

STEP 04

Evidentiary classification

Results are mapped to SaTML 2025 evidence tiers with Daubert compliance flags and jurisdiction-specific thresholds.

A · Daubert-ready: Reproducible, peer-reviewed methods, error rates published.
B · Pre-filing diligence: Strong indicators; supports complaint and discovery.
C · Investigative: Signal warrants further extraction probes.
D · Insufficient: No memorization detected at adversarial threshold.

STEP 05

Court-ready package

Signed PDF with chain of custody, expert-declaration template, exhibit list, and blockchain-anchored hash. Ready for filing.

package.iplyr

json

{
  "package_id": "pkg_01J9X8K2QV",
  "tier": "A",
  "daubert_compliant": true,
  "chain_of_custody": "sha256:9c3a…f201",
  "btc_anchor": { "block": 832914, "txid": "5fa1…" },
  "exhibits": ["A-Findings.pdf", "B-PerBlock.pdf", "C-Methods.pdf"],
  "expert_declaration_template": "decl-rev-2025.03.docx",
  "reproducible_from_hash": true
}

TDPI Marketplace

The first price index and OTC marketplace for training data.

Discover, RFQ, clear, and settle training-data licenses with verified provenance.

Price Index

Daily reference prices for training-data classes (text, image, audio, video, code), segmented by domain and rights regime.

TWAP and VWAP across venues
Domain & jurisdiction segmentation
Subscription + API access
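
For readers less familiar with the two averages, here is a minimal Python sketch of TWAP (time-weighted) and VWAP (volume-weighted) over a small set of trades. The `Trade` fields, the hour-based time unit, and the sample prices are hypothetical; this is a textbook formulation, not the index's published methodology.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    t: float       # hours since the start of the window (hypothetical unit)
    price: float   # USD per Mtok
    volume: float  # Mtok


def vwap(trades):
    """Volume-weighted average price: each price weighted by traded volume."""
    total_volume = sum(tr.volume for tr in trades)
    return sum(tr.price * tr.volume for tr in trades) / total_volume


def twap(trades, window_end):
    """Time-weighted average: each price is held until the next trade."""
    ordered = sorted(trades, key=lambda tr: tr.t)
    weighted = 0.0
    for cur, nxt in zip(ordered, ordered[1:] + [None]):
        end = nxt.t if nxt is not None else window_end
        weighted += cur.price * (end - cur.t)
    return weighted / (window_end - ordered[0].t)


trades = [
    Trade(t=0.0, price=0.0140, volume=100.0),
    Trade(t=6.0, price=0.0150, volume=300.0),
    Trade(t=18.0, price=0.0146, volume=100.0),
]
print(f"VWAP = {vwap(trades):.5f}")        # 0.01472
print(f"TWAP = {twap(trades, 24.0):.5f}")  # 0.01465
```

VWAP leans toward the price at which most volume cleared, while TWAP leans toward the price that prevailed longest, so the two diverge when large trades happen in short bursts.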

TDPI · live tape · USD / Mtok · streaming

text.en.news · $0.0146 · +3.00%
image.editorial · $0.0085 · -4.27%
audio.spoken · $0.0207 · -3.45%
video.broadcast · $0.0338 · +2.10%
code.permissive · $0.0059 · +2.13%

modules

API endpoints

test-validated

Get marketplace access

Modalities

Four modalities. One forensic chain.

The forensic methods change with the medium. The chain of custody does not.

Text & Code

From book-length memorization to GitHub-license contamination.

Explore

Images & Diffusion

Style attribution, dataset MIA, and concept memorization.

Explore

Audio & Music

Melodic similarity, generator ID, catalog-scale monitoring.

Explore

Video

Temporal similarity, AV-sync, generator artifact attribution.

Explore

Methods Library

40+ detection methods across the forensic stack.

Browse the detector families we combine when preparing an evidentiary report. Each entry links to a summary page. Full technical parameters are disclosed to counsel and opposing experts under MNDA.

Min-K%++

Daubert

Zhang et al. · ICLR 2025

Calibrated membership-inference attack against pretraining data. Uses per-token log-likelihood vs. neighborhood entropy to achieve state-of-the-art AUROC across LLaMA, Pythia, GPT-NeoX families. Decouples token informativeness from membership signal, producing a clean p-value under permutation null.
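
As a rough sketch of the scoring idea, under our reading of the published method: each observed token's log-probability is standardized against the mean and standard deviation of log-probabilities under the model's own next-token distribution, and the lowest k% of those standardized scores are averaged. The Python below assumes access to full next-token distributions, which many hosted APIs do not expose. The toy distributions and the name `min_k_plus_plus` are hypothetical.

```python
import math


def token_score(dist: dict, token: str) -> float:
    """Standardized log-probability of the observed token at one position.

    dist maps every candidate token to its probability (all must be > 0).
    """
    logs = {tok: math.log(p) for tok, p in dist.items()}
    mu = sum(dist[tok] * logs[tok] for tok in dist)
    var = sum(dist[tok] * (logs[tok] - mu) ** 2 for tok in dist)
    sigma = math.sqrt(var)
    if sigma < 1e-12:  # (near-)uniform distribution: nothing to contrast
        return 0.0
    return (logs[token] - mu) / sigma


def min_k_plus_plus(positions, k_percent=20.0):
    """Average the lowest k% of per-token scores; higher suggests membership."""
    scores = sorted(token_score(dist, tok) for dist, tok in positions)
    keep = max(1, math.ceil(len(scores) * k_percent / 100.0))
    return sum(scores[:keep]) / keep


# Toy (distribution, observed token) pairs, for illustration only.
seen_like = [
    ({"the": 0.90, "a": 0.05, "an": 0.05}, "the"),
    ({"cat": 0.80, "dog": 0.15, "eel": 0.05}, "cat"),
    ({"sat": 0.85, "ran": 0.10, "hid": 0.05}, "sat"),
]
unseen_like = [
    ({"the": 0.90, "a": 0.05, "an": 0.05}, "a"),
    ({"cat": 0.80, "dog": 0.15, "eel": 0.05}, "eel"),
    ({"sat": 0.85, "ran": 0.10, "hid": 0.05}, "ran"),
]
print(min_k_plus_plus(seen_like) > min_k_plus_plus(unseen_like))  # True
```

A raw score like this is not yet a finding. Turning it into a calibrated decision requires reference distributions from known members and non-members, which this sketch does not include.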

DE-COP

Duarte et al. · arXiv 2024

Detection of copyrighted content via paraphrase-distinguishing probes. Forces the model to choose between original passages and high-quality paraphrases; preference for the original is statistically attributable to memorization.

text

35 tests
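
To illustrate how "preference for the original" could be tested statistically, the sketch below runs an exact one-sided binomial test against chance. It assumes a four-option format (one verbatim passage, three paraphrases), so the chance rate is 0.25. The counts are hypothetical, and a real protocol would also need to control for answer-position bias and similar confounds.

```python
from math import comb


def binomial_tail(successes: int, trials: int, chance: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance ** k * (1.0 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


trials = 40           # hypothetical number of multiple-choice questions
picked_original = 22  # hypothetical count of verbatim passages chosen
p_value = binomial_tail(picked_original, trials, chance=0.25)
print(f"{picked_original}/{trials} originals chosen, one-sided p = {p_value:.2e}")
```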

nv-recall (Stanford Algorithm 1)

Daubert

Stanford NLP · Stanford 2024

Non-verbatim recall measurement. Replaces exact-match with semantic-equivalence scoring under a calibrated neighborhood, recovering memorization signal that survives stylistic rewriting and translation.

Cooper PDE

Cooper et al. · arXiv 2024

Probabilistic discoverable extraction with confidence bounds. Estimates the probability that a target string is recoverable from a model under a budgeted prompt distribution, with formal lower-bound guarantees.
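
One common way to express this idea: if a single sampled response reproduces the target string with probability p_z, then across n independent samples the chance of seeing it at least once is 1 - (1 - p_z)^n. The Python sketch below works through that arithmetic. The independence assumption, the function names, and the example value of p_z are ours, and the confidence bounds mentioned above are not reproduced here.

```python
import math


def extraction_probability(p_z: float, n: int) -> float:
    """Probability of at least one successful extraction in n independent samples."""
    return 1.0 - (1.0 - p_z) ** n


def queries_needed(p_z: float, p: float) -> int:
    """Smallest n such that extraction_probability(p_z, n) >= p."""
    if not (0.0 < p_z < 1.0 and 0.0 < p < 1.0):
        raise ValueError("p_z and p must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - p_z))


p_z = 0.01  # hypothetical per-sample probability of reproducing the target
print(f"{extraction_probability(p_z, 100):.3f}")  # 0.634
print(queries_needed(p_z, 0.90))                  # 230
```

The point of the probabilistic framing is that a string a model emits only 1% of the time still becomes very likely to surface under a modest sampling budget.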

TRAK

Park et al. · ICML 2023

Training-data attribution at scale via random-projection of gradients. Returns a per-training-example influence score on a target query.

Park et al. (extension) · ICML 2024

Scalable TRAK variant for billion-parameter diffusion models. Uses distilled gradient surrogates and locality-sensitive hashing for tractable attribution across catalog-scale training indices.

image

41 tests

LoRA Probes

IPLYR Internal · Internal 2025

Low-rank adapter probes that fingerprint memorization-prone parameter subspaces. Surface circuits where training-data residue concentrates.

IPLYR + Speech-MIA literature · ICASSP-adjacent 2024

Speech and music MIA combining acoustic loss-curvature, mel-spectrogram divergence, and waveform-level NN-distance.

audio

29 tests

CSD style fingerprinting

Daubert

Somepalli et al. · NeurIPS 2023

Contrastive style descriptors capturing artist-specific visual fingerprints invariant to subject matter. Validated for Getty / artist-style cases.

image

22 tests

MelodySim

Music-MIR consortium · ISMIR 2024

Melodic similarity across pitch-class profiles and rhythmic envelopes for music memorization claims.

audio

16 tests

STALL video forensics

Video-Forensics WG · CVPR-W 2024

Spatiotemporal action-level latent localization for short-form video memorization detection.

Merge-forensics literature · arXiv 2024

Forensic detection of model merges and lineage via centered kernel alignment between candidate parent models and a target.

text · image

33 tests
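
For orientation, here is a minimal pure-Python sketch of linear centered kernel alignment (CKA) between two activation matrices. It assumes rows are the same probe inputs fed to both models and columns are features from one layer of each. The tiny matrices are hypothetical, and real lineage analysis would compare many layers and candidate parents, typically with a numerical library.

```python
import math


def center(matrix):
    """Subtract each column's mean so features are zero-centered."""
    cols = list(zip(*matrix))
    means = [sum(c) / len(c) for c in cols]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def cross_sq_norm(a, b):
    """Squared Frobenius norm of A^T B, computed column pair by column pair."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            total += sum(x * y for x, y in zip(col_a, col_b)) ** 2
    return total


def linear_cka(x, y):
    """Linear CKA in [0, 1]; 1 means identical representational geometry."""
    x, y = center(x), center(y)
    return cross_sq_norm(x, y) / math.sqrt(cross_sq_norm(x, x) * cross_sq_norm(y, y))


target = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
rescaled = [[2.0 * v for v in row] for row in target]  # same geometry, rescaled
unrelated = [[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]]

print(linear_cka(target, rescaled))   # close to 1: scaling does not change CKA
print(linear_cka(target, unrelated))  # 0 for this constructed example
```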

Known limits, disclosed

· 8th Circuit case-law coverage is currently 0 cases; this is a flagged gap for the next dataset expansion.
· Audio and video modules (MelodySim, STALL) are Tier 2–3; full Daubert-tier hardening is in progress.
· Reports are evidentiary inputs, not legal conclusions. Daubert-tier classification reflects the current SaTML 2025 framework; ultimate admissibility is the court's determination.

IP Portfolio

12 patentable innovations. Filing in progress.

Independent novelty analysis identified 12 patentable inventions across attribution, forensics, and evidence packaging. The top three rated 5/5 on novelty.

Multi-Teacher MIA Ensemble

Statistical aggregation of independent membership-inference detectors using Fisher's, Stouffer's, Simes', harmonic-mean, and Cauchy combination, calibrated against dependence structure.

MIA · ensemble · Daubert-tier
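
The five combination rules named above have short textbook forms, sketched below in Python using only the standard library. This is not the patented ensemble: Fisher's and Stouffer's methods as written assume independent detectors, the harmonic mean is shown uncalibrated, and the dependence calibration described above is not reproduced. Inputs must lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist

_N = NormalDist()


def fisher(pvals):
    """Fisher: -2*sum(ln p) ~ chi-square with 2k dof (closed-form tail for even dof)."""
    k = len(pvals)
    half = -sum(math.log(p) for p in pvals)
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))


def stouffer(pvals):
    """Stouffer: sum of z-scores divided by sqrt(k)."""
    z = sum(_N.inv_cdf(1.0 - p) for p in pvals) / math.sqrt(len(pvals))
    return 1.0 - _N.cdf(z)


def simes(pvals):
    """Simes: min over ranks of n * p_(i) / i."""
    n = len(pvals)
    return min(n * p / rank for rank, p in enumerate(sorted(pvals), start=1))


def harmonic_mean(pvals):
    """Raw harmonic mean p-value (no calibration applied)."""
    return len(pvals) / sum(1.0 / p for p in pvals)


def cauchy(pvals):
    """Cauchy combination: average tan-transformed p-values, then invert."""
    t = sum(math.tan((0.5 - p) * math.pi) for p in pvals) / len(pvals)
    return 0.5 - math.atan(t) / math.pi


detector_pvals = [0.01, 0.02, 0.03]  # hypothetical outputs of three detectors
for rule in (fisher, stouffer, simes, harmonic_mean, cauchy):
    print(f"{rule.__name__:>13}: {rule(detector_pvals):.5f}")
```

The rules differ most when detectors disagree or are correlated, which is why the choice of rule and its calibration matter more than any single combined number.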

Textual Inversion Forensics

Diffusion-model attribution via inverted embedding probes. Recovers a pseudo-token best activating a target style, with significance scoring against a distractor distribution.

image · attribution · diffusion

Daubert Scoring

Automated evidentiary-tier classification mapping detector outputs to the four Daubert factors: testability, peer review, known error rates, and general acceptance.

evidence · Daubert · tiering

Portfolio appraised at $975K-$1.475M. Provisional applications in preparation.

Patent inquiries →

Who we serve

Built for the people who actually litigate.

Walk into deposition with a sealed evidence package.

Pre-filing diligence, expert-declaration templates, cross-jurisdiction memos, and a calibrated settlement-pricing model.

→ Pre-filing diligence in 7 days
→ Expert-declaration templates
→ Cross-jurisdiction memos
→ Settlement pricing model

Read the IP Litigators brief

IP Litigators · sample workflow

01 Submit complaint and target model
02 Run Tier-A extraction probes
03 Generate exhibits A-C
04 Hand expert declaration to ECF

Landscape

How IPLYR compares.

Independent landscape analysis of the AI training-data attribution space.

Capability	IPLYR	ProRata	Patronus	Spawning	Story
Funding	Solo / pre-seed	$30M	$17M	$6M	$54M
Daubert-tier evidence packaging
Memorization forensics	40+ methods		Hallucination focus	Opt-out registry
Jurisdictions modeled	9 + 10 US states	US	US	US	US
Pricing	$2K-$5K / report	Enterprise	Enterprise	Free opt-out	Token-based

Forensic evidence is a category none of them serve. Plaintiffs in Kadrey, Getty, and Bartz-style matters are losing for lack of exactly this.

Methods reviewed with IP attorneys at AIPLA · Validated against the 111-case fair-use dataset (53 FU / 58 NFU) · Built on 1,349 CourtListener copyright precedents (71M opinion clusters scanned) · 277 judge profiles · landmark cases indexed · distillation-moat tests passing · TDPI tests across 16 modules · 193/193 smart contracts compile (OZ v5) · 9 jurisdictions + 10 US state statutes modeled · 14 LLM provider cost models.

AmLaw 100 firms

Major-label publishers

Frontier AI labs (defense)

Photo & news agencies

Author estates

EU regulators

Pricing

Built for litigation. Priced for everyone.

Forensic-grade evidence at a fraction of an expert engagement.

Forensic Report

$2K - $5K

per work / per model

Single-engagement evidence package, Daubert-classified, 7-day turnaround.

Tier-A evidentiary package
Per-block findings
SHA-256 chain of custody
7-day turnaround

Order a report

Most chosen

Litigation Support

from $25K

per matter

Multi-work, multi-model, expert-declaration drafting, deposition support, settlement-pricing memo.

Up to 50 works · 6 models
Expert declaration drafting
Deposition support
Settlement pricing memo

Talk to sales

Continuous Monitoring

from $5K

per month

Catalog-scale monitoring across 50+ models, with weekly reports and alerts on new memorization events.

50+ models monitored
Weekly forensic reports
Real-time alerting
API access

Start monitoring

Enterprise / API

Custom

white-label · on-prem · SLA

White-label, on-prem, SSO, SOC 2, audit logs, contractual SLAs.

SSO + RBAC
SOC 2 Type II
On-prem deployment
Custom SLAs

ROI

What you'd actually save.

Drag the sliders to model a matter. Baseline: $2K per work per model. Expert engagements anchor at $50K-$500K (American Lawyer expert rate surveys, 2024).

Works at issue: 50 works (slider range 1 to 10,000)

Target models: 3 models (slider range 1 to 20)

Jurisdictions: 1 jurisdiction

Anchored to settlement_pricing_engine · Bartz settlement context: $3,000 / work · IPLYR per-work cost: $6,000

Estimated engagement cost

IPLYR

$300,000

Traditional expert

$350,000

Savings: 14% · $50,000

Excludes deposition support (+15–20%). 10×–250× cost reduction range across realistic matter sizes.
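
The figures in this example follow from simple arithmetic, sketched below in Python. The $2,000 baseline and the 50-work, 3-model inputs come from the calculator above. The $350,000 traditional-expert figure is taken as an input because the page does not state how it is derived, and the function name `estimate_engagement` is hypothetical.

```python
def estimate_engagement(works, models, rate_per_work_per_model=2_000,
                        traditional_cost=350_000):
    """Reproduce the calculator's headline numbers from its stated inputs."""
    per_work_cost = models * rate_per_work_per_model  # one work across all models
    iplyr_cost = works * per_work_cost
    savings = traditional_cost - iplyr_cost
    return {
        "per_work_cost": per_work_cost,
        "iplyr_cost": iplyr_cost,
        "savings": savings,
        "savings_pct": round(100 * savings / traditional_cost),
    }


print(estimate_engagement(works=50, models=3))
# {'per_work_cost': 6000, 'iplyr_cost': 300000, 'savings': 50000, 'savings_pct': 14}
```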

Research

Open methods. Defensible math.

Every detector is published. Every score is reproducible from the chain-of-custody hash.

Playground

Try every detector. In your browser.

Pick an endpoint, inspect the request, watch a realistic response stream in. Outputs below are canned but realistic; actual calls require an API key.

request body

{
  "model": "anthropic/claude-3.7",
  "work_blocks": ["sha256:b2f0…d41a", "sha256:9c3a…f201"],
  "k_percent": 20
}

response · 200 OK

complete

json

…

40+ attribution & forensic endpoints · 8 TDPI marketplace endpoints · 6 memorization-risk endpoints. Full OpenAPI 3.1 spec at /docs/api.

Jurisdictions

One platform. Every regime that matters.

Circuit-by-circuit US analysis plus the major international copyright and AI regimes. Hover the map to drill in.

United States

Federal + 12 circuits + 10 state statutes (CA, TN, CO, UT, IL, NY, MA, TX, FL, OR)

European Union

DSM Article 4 · AI Act Annex III · EDPB

United Kingdom

CDPA · proposed TDM exception

Germany

GEMA dual-infringement framework

France

CPI · droit d'auteur

Japan

Article 30-4

South Korea

FAQ

Questions a litigator would ask.

IPLYR produces evidence packaged to the Daubert factors: testability, peer review, known error rates, and general acceptance. Whether any specific evidence is admitted is the court's determination. We provide the foundation; counsel argues admissibility.

Founder, IPLYR

Open to expert-witness collaborations

"I built IPLYR because I sat through too many AI copyright matters where plaintiffs had a real case and no forensic vocabulary to prove it.

The methods exist in the literature (Min-K%++, nv-recall, Cooper PDE, TRAK, MIA ensembles), but they live in conference papers, not in litigation workflows. IPLYR translates them into evidentiary-grade packages with chain of custody, Daubert classification, and jurisdiction-aware analysis.

Methods reviewed with IP attorneys at AIPLA. Open to expert-witness collaborations and academic partnerships."

Talk to founder or send a note →

Eight revenue lanes