Archive.org
Search Internet Archive items and fetch metadata and downloadable file URLs.
GENE-ArchiveORG: Give Your AI Full Access to the Internet Archive
GENE-ArchiveORG connects your AI assistant and Feluda Flows to the Internet Archive — one of the largest publicly accessible digital libraries on the planet, home to hundreds of billions of web pages, millions of books, audio recordings, films, software titles, and research datasets. With this gene installed, your AI can search the Archive, read full item records, inspect file listings, write and manage item metadata, post reviews, and interact with background processing tasks — all from within your Feluda application.
Whether you are archiving research, cataloguing a collection, monitoring preservation tasks, or building an automated pipeline that reads from the historical web, GENE-ArchiveORG gives you the tools to do it in one place.
What can I do with GENE-ArchiveORG enabled?
Once the gene is connected, your AI gains the ability to search and navigate hundreds of millions of archived items by keyword, subject, creator, media type, or any combination of Lucene query fields supported by the Archive. It can fetch complete metadata records for any item — including file listings, descriptions, subjects, collections, and contributor details. For items you own or have write access to, your AI can add, update, or remove metadata fields, post and manage reviews, and submit background processing tasks such as file derivation or re-indexing. A built-in history database keeps a local record of every search, item lookup, and task submitted — so your AI can always refer back to what it has already found or done.
What tools does GENE-ArchiveORG include?
GENE-ArchiveORG comes with five powerful, production-ready tools. Every tool supports a help call that returns an immediate, plain-language summary of what the tool does, what parameters it accepts, and example uses — so your AI always has the information it needs before making a call.
| Tool | What it does |
|---|---|
| Archive Search | Searches the Internet Archive full-text index using keyword or structured Lucene queries. Supports media type filtering (texts, audio, movies, software, images, collections), field selection, deep cursor-based pagination for large result sets, and an advanced search mode with page-based navigation. Every search is logged to a local history database so your AI can revisit previous results without making unnecessary repeat requests. |
| Archive Item | Reads the complete metadata record for any Internet Archive item by its identifier. Retrieves the full metadata block, the file manifest (every file associated with the item), or a specific individual metadata field — with built-in pagination for items that contain hundreds or thousands of files. Viewed items are tracked in a local history log. |
| Archive Metadata Write | Adds, replaces, or removes metadata fields on Internet Archive items you have write access to. Supports appending values to existing array fields such as subjects, sending multi-operation patches for complex updates, and targeting file-level metadata as well as item-level metadata. All write operations require your Internet Archive credentials to be configured as secrets. |
| Archive Reviews | Reads, writes, and deletes reviews on Internet Archive items. Each Archive account can hold one review per item — writing a new review with this tool updates the existing one automatically. Star ratings from one to five are supported. Requires credentials for all operations. |
| Archive Tasks | Lists, submits, reruns, and inspects the background processing tasks that the Internet Archive uses to derive files, run fixers, delete content, and more. Covers pending tasks, running tasks, completed task history, rate limit reporting, and full task log retrieval. Submitted tasks are recorded in a local history database. Requires credentials. |
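To make the cursor-based pagination mentioned for Archive Search more concrete, here is a minimal sketch of how such a loop could look against the Internet Archive's public scrape endpoint. It is an illustration, not the gene's actual implementation: the helper names `scrape_url` and `iter_items` are hypothetical, and the response shape (an `items` list plus an optional `cursor`) is assumed from the public API's documented behaviour.

```python
from urllib.parse import urlencode

# Public scrape endpoint of the Internet Archive search service.
SCRAPE_ENDPOINT = "https://archive.org/services/search/v1/scrape"


def scrape_url(query, fields=("identifier", "title"), count=100, cursor=None):
    """Build the URL for one page of scrape-API results (hypothetical helper)."""
    if count < 100:
        raise ValueError("the scrape API expects count >= 100")
    params = {"q": query, "fields": ",".join(fields), "count": count}
    if cursor:
        params["cursor"] = cursor
    return f"{SCRAPE_ENDPOINT}?{urlencode(params)}"


def iter_items(fetch_json, query, **kwargs):
    """Yield items page by page, following the cursor until none is returned.

    fetch_json is any callable that takes a URL and returns the decoded JSON
    body, for example a thin wrapper around urllib or requests.
    """
    cursor = None
    while True:
        page = fetch_json(scrape_url(query, cursor=cursor, **kwargs))
        yield from page.get("items", [])
        cursor = page.get("cursor")
        if not cursor:
            break
```

Passing the fetch function in as an argument keeps the pagination logic separate from the HTTP client, so the loop can be exercised with canned responses before it touches the network.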
What is the Internet Archive?
The Internet Archive is a non-profit digital library founded in 1996. It runs the Wayback Machine — which has archived over 900 billion web pages — and hosts one of the world's largest open collections of digitised books, historical audio recordings, classic films, vintage software, live music recordings, research datasets, and more. All of this material is publicly accessible through its REST API, which GENE-ArchiveORG uses to bring that access directly into your AI and automation workflows.
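As a rough illustration of what reading through that REST API looks like, the sketch below builds the URLs for a full item record, a file listing, and a single metadata field. The `metadata_url` helper is hypothetical and only assembles addresses; the identifiers are sample values, and this is not necessarily how the gene constructs its requests.

```python
from urllib.parse import quote

# Base address of the Internet Archive Metadata API.
METADATA_ENDPOINT = "https://archive.org/metadata"


def metadata_url(identifier, *path):
    """Build a Metadata API URL, optionally narrowed to a sub-path."""
    parts = [quote(identifier, safe="")]
    parts += [quote(str(p), safe="") for p in path]
    return "/".join([METADATA_ENDPOINT] + parts)


# Whole record, file manifest only, and one specific field.
full_record = metadata_url("GratefulDead")
file_list = metadata_url("NASA_Hubble_Deep_Field", "files")
one_field = metadata_url("gd1977-05-08.flac16", "metadata", "description")
```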
Requirements and credentials
GENE-ArchiveORG is split into two categories of operations: those that are freely accessible to the public, and those that require an Internet Archive account with S3-compatible access keys.
Read-only operations — no account needed
The following tools work immediately without any secrets configured. They access publicly available content and do not require authentication:
- Archive Search
- Archive Item
Write operations — Internet Archive account and S3 keys required
The following tools require your Internet Archive S3 access credentials to be saved in the Feluda Secrets page:
- Archive Metadata Write
- Archive Reviews
- Archive Tasks
How to get your Internet Archive S3 keys
Step 1 — Create or log in to your Internet Archive account
Visit archive.org and sign in. If you do not have an account yet, registration is free and instant.
Step 2 — Generate your S3 access keys
Go to archive.org/account/s3.php while logged in.
Step 3 — Save your keys in the Feluda Secrets page
Open the Feluda Desktop App, navigate to the Secrets page, and add the following two secrets:
| Secret name | Where to find it |
|---|---|
| IA_S3_ACCESS | The access key shown on the Internet Archive S3 keys page. |
| IA_S3_SECRET | The secret key shown on the Internet Archive S3 keys page. |
Once these are saved, all five tools in GENE-ArchiveORG become fully operational.
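For readers curious how the two keys are typically used, the Internet Archive's S3-style APIs accept an `Authorization` header of the form `LOW access:secret`. The sketch below shows one way that header could be assembled. It assumes, purely for illustration, that the two secrets are available as environment variables under the same names; how Feluda actually hands secrets to the gene is not described here, and `ia_auth_header` is a hypothetical helper.

```python
import os


def ia_auth_header(env=os.environ):
    """Build the Authorization header from the two secret names.

    Assumption: the secrets are exposed as environment variables. Any
    mapping with the same keys can be passed in instead.
    """
    access = env.get("IA_S3_ACCESS")
    secret = env.get("IA_S3_SECRET")
    if not access or not secret:
        raise RuntimeError("IA_S3_ACCESS and IA_S3_SECRET must both be set")
    return {"Authorization": f"LOW {access}:{secret}"}
```

Failing early when either value is missing avoids sending a half-formed header to the Archive.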
Settings you can configure
GENE-ArchiveORG exposes several settings from the Feluda Settings page that let you tune how the gene behaves without changing anything in your workflow.
| Setting | Description |
|---|---|
| User Agent | The identifier sent to the Internet Archive with every request. The default value is appropriate for most users and complies with the Archive's usage guidelines. |
| Max Tokens In Response | Controls the maximum size of a response before the gene automatically splits it into pages. Large metadata records and file listings will be chunked according to this limit, letting your AI read them in manageable portions. |
| Search Page Size | The default number of results returned per search request. The Internet Archive API requires a minimum of 100 per page — this setting lets you tune the upper end for your workflow. |
| Task Page Size | The default number of task records returned per task listing request. |
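The Max Tokens In Response setting can be pictured with a small sketch of token-budgeted paging. This is not the gene's actual algorithm: the four-characters-per-token estimate and both function names are assumptions chosen only to show the idea of splitting a long file listing into pages that each fit a budget.

```python
def estimate_tokens(text):
    """Very rough estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def paginate_by_tokens(lines, max_tokens):
    """Group lines into pages whose estimated token total fits max_tokens.

    A single line that exceeds the budget on its own still gets its own page.
    """
    pages, current, used = [], [], 0
    for line in lines:
        cost = estimate_tokens(line)
        # Start a new page when adding this line would exceed the budget.
        if current and used + cost > max_tokens:
            pages.append(current)
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        pages.append(current)
    return pages
```

A lower limit produces more, smaller pages; a higher limit returns more of the record in a single response.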
Every tool has built-in help
Every tool in GENE-ArchiveORG has a built-in help case. When your AI calls any tool with the help option, it immediately receives a structured description of what that tool does, all the parameters it accepts with their types and constraints, which parameters are required and which are optional, and example calls for common tasks. Your AI never has to guess what a tool needs — it can always ask the tool to explain itself first.
How GENE-ArchiveORG works inside Feluda Flows
GENE-ArchiveORG becomes significantly more powerful when used inside Feluda Flows. Every tool in the gene can be wired into a multi-step flow, turning what would otherwise be manual one-off queries into fully automated, repeatable pipelines.
Real-world use cases
Here are examples of how different users put GENE-ArchiveORG to work every day.
Handy starting commands
These are ready-to-use prompts you can give your AI once GENE-ArchiveORG is installed and connected. Adapt the wording to your own style — your AI will know exactly what to do.
| What you want to do | What to say to your AI | What happens |
|---|---|---|
| Search the Archive | Search the Internet Archive for items related to early Linux distributions. | Runs a full-text search across the Archive catalogue and returns identifiers, titles, descriptions, and download counts for the top results. |
| Read an item's metadata | Fetch the full metadata record for the Internet Archive item with identifier GratefulDead. | Retrieves the complete metadata object for the item — title, description, subjects, collection, creator, date, and all other fields. |
| List an item's files | Show me all the files in the Internet Archive item NASA_Hubble_Deep_Field. | Returns the file manifest — every file associated with the item, including formats, sizes, and checksums. |
| Read a specific field | What is the description field of Archive item gd1977-05-08.flac16? | Fetches only that specific metadata field instead of pulling the entire record — efficient for large items. |
| Add a subject tag | Add the subject tag "open source" to my Archive item my-software-archive-2024. | Appends the new subject value to the item's existing subject array using a metadata write operation. Requires credentials. |
| Check task rate limits | What are my current Internet Archive task rate limits for derive operations? | Queries the tasks API and returns your current task limits, how many tasks are in flight, and how many are blocked. |
| Submit a derive task | Submit a derive task for my Archive item my-uploaded-recording. | Queues a background derive job on the Internet Archive, which rebuilds derivative files (MP3s, thumbnails, etc.) from the source upload. Requires credentials. |
| Advanced search with filters | Search the Internet Archive for texts about World War II published before 1950, sorted by number of downloads. | Runs an advanced Lucene query with media type, date range, and sort filters applied — returning structured results with pagination support. |
| Write a review | Write a five-star review for Archive item gd1977-05-08.flac16 with the title "An essential recording" and a short description of why it matters. | Posts a review to the item on behalf of your Archive account. Requires credentials. If a review already exists from your account, this updates it. |
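To illustrate what happens behind a prompt such as "Add a subject tag", the sketch below builds the kind of JSON Patch body the Internet Archive's Metadata Write API accepts in its `-patch` form field. The helper names are hypothetical, credentials are deliberately left out, and the patch assumes the item's `subject` field is already stored as an array; an item holding a single string value would need a different patch.

```python
import json


def append_subject_patch(value):
    """JSON Patch (RFC 6902) that appends one value to the subject array."""
    return [{"op": "add", "path": "/subject/-", "value": value}]


def metadata_write_form(patch, target="metadata"):
    """Form fields for a Metadata Write request body (credentials omitted)."""
    return {"-target": target, "-patch": json.dumps(patch)}


# Body for appending "open source" to an item's subjects.
form = metadata_write_form(append_subject_patch("open source"))
```

The trailing `/-` in the path is the JSON Patch convention for "append to the end of this array", which is why the existing subjects are left untouched.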
Your credentials stay secure on your device
Security is taken seriously at every level of how GENE-ArchiveORG handles your Internet Archive credentials.
- Your IA_S3_ACCESS and IA_S3_SECRET values are stored in the Feluda Secrets page on your own device. They are never sent to any cloud service, third party, or external server.
- The gene communicates with archive.org directly. All requests go from your device to the Archive's API, with no intermediary involved.
- To rotate your keys, generate new ones at archive.org/account/s3.php and update the Feluda Secrets accordingly. Old keys can be invalidated immediately from the same page.