Archive.org
Dataspace: GENE-ArchiveORG | Name: GENE-ArchiveORG | SKU: GENE-ARCHIVEORG | User: Reza Rafati | Version: 1.0.1

Activated by 6 users
Allow your AI to access archive.org with ease!

Search Internet Archive items and fetch metadata and downloadable file URLs.

SKU: GENE-ARCHIVEORG
Created: 2025-12-23 17:38:13.77139 +0000 UTC

GENE-ArchiveORG: Give Your AI Full Access to the Internet Archive

GENE-ArchiveORG connects your AI assistant and Feluda Flows to the Internet Archive — one of the largest publicly accessible digital libraries on the planet, home to hundreds of billions of web pages, millions of books, audio recordings, films, software titles, and research datasets. With this gene installed, your AI can search the Archive, read full item records, inspect file listings, write and manage item metadata, post reviews, and interact with background processing tasks — all from within your Feluda application.

Whether you are archiving research, cataloguing a collection, monitoring preservation tasks, or building an automated pipeline that reads from the historical web, GENE-ArchiveORG gives you the tools to do it in one place.

What can I do with GENE-ArchiveORG enabled?

Once the gene is connected, your AI gains the ability to search and navigate hundreds of millions of archived items by keyword, subject, creator, media type, or any combination of Lucene query fields supported by the Archive. It can fetch complete metadata records for any item — including file listings, descriptions, subjects, collections, and contributor details. For items you own or have write access to, your AI can add, update, or remove metadata fields, post and manage reviews, and submit background processing tasks such as file derivation or re-indexing. A built-in history database keeps a local record of every search, item lookup, and task submitted — so your AI can always refer back to what it has already found or done.

What tools does GENE-ArchiveORG include?

GENE-ArchiveORG comes with five powerful, production-ready tools. Every tool supports a help call that returns an immediate, plain-language summary of what the tool does, what parameters it accepts, and example uses — so your AI always has the information it needs before making a call.

Tool | What it does
Archive Search | Searches the Internet Archive full-text index using keyword or structured Lucene queries. Supports media type filtering (texts, audio, movies, software, images, collections), field selection, deep cursor-based pagination for large result sets, and an advanced search mode with page-based navigation. Every search is logged to a local history database so your AI can revisit previous results without making unnecessary repeat requests.
Archive Item | Reads the complete metadata record for any Internet Archive item by its identifier. Retrieves the full metadata block, the file manifest (every file associated with the item), or a specific individual metadata field, with built-in pagination for items that contain hundreds or thousands of files. Viewed items are tracked in a local history log.
Archive Metadata Write | Adds, replaces, or removes metadata fields on Internet Archive items you have write access to. Supports appending values to existing array fields such as subjects, sending multi-operation patches for complex updates, and targeting file-level metadata as well as item-level metadata. All write operations require your Internet Archive credentials to be configured as secrets.
Archive Reviews | Reads, writes, and deletes reviews on Internet Archive items. Each Archive account can hold one review per item, so writing a new review with this tool updates the existing one automatically. Star ratings from one to five are supported. Requires credentials for all operations.
Archive Tasks | Lists, submits, reruns, and inspects the background processing tasks that the Internet Archive uses to derive files, run fixers, delete content, and more. Covers pending tasks, running tasks, completed task history, rate limit reporting, and full task log retrieval. Submitted tasks are recorded in a local history database. Requires credentials.
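
To make the cursor-based pagination mentioned for Archive Search more concrete, here is a minimal Python sketch of how a client might walk a large result set. It assumes the Internet Archive scraping endpoint at archive.org/services/search/v1/scrape and its `q`, `fields`, `count`, and `cursor` parameters; the helper names `scrape_url` and `iter_items` and the injected `fetch_json` callable are hypothetical and are not taken from the gene itself.

```python
from typing import Callable, Iterator, Optional, Sequence
from urllib.parse import urlencode

# Assumed endpoint of the Internet Archive scraping (cursor) search API.
SCRAPE_URL = "https://archive.org/services/search/v1/scrape"


def scrape_url(query: str, fields: Sequence[str], count: int = 100,
               cursor: Optional[str] = None) -> str:
    """Build one page request; the cursor is only sent after the first page."""
    if count < 100:
        raise ValueError("this sketch assumes a minimum page size of 100")
    params = {"q": query, "fields": ",".join(fields), "count": count}
    if cursor:
        params["cursor"] = cursor
    return f"{SCRAPE_URL}?{urlencode(params)}"


def iter_items(query: str, fields: Sequence[str],
               fetch_json: Callable[[str], dict], count: int = 100) -> Iterator[dict]:
    """Yield every result, following the cursor until the API stops returning one."""
    cursor = None
    while True:
        page = fetch_json(scrape_url(query, fields, count, cursor))
        yield from page.get("items", [])
        cursor = page.get("cursor")
        if not cursor:
            break
```

Passing the HTTP call in as `fetch_json` keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned pages.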

What is the Internet Archive?

The Internet Archive is a non-profit digital library founded in 1996. It runs the Wayback Machine — which has archived over 900 billion web pages — and hosts one of the world's largest open collections of digitised books, historical audio recordings, classic films, vintage software, live music recordings, research datasets, and more. All of this material is publicly accessible through its REST API, which GENE-ArchiveORG uses to bring that access directly into your AI and automation workflows.

Requirements and credentials

GENE-ArchiveORG is split into two categories of operations: those that are freely accessible to the public, and those that require an Internet Archive account with S3-compatible access keys.

Read-only operations — no account needed

The following tools work immediately without any secrets configured. They access publicly available content and do not require authentication:

  • Archive Search: Searching the public catalogue is completely open. You can start searching the moment the gene is installed.
  • Archive Item (read metadata and files): Reading metadata records and file listings for public items requires no authentication.
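
As an illustration of what these unauthenticated reads look like, the sketch below builds request URLs for the public Metadata API at archive.org/metadata. The function name `metadata_url` is hypothetical, and the `start` and `count` paging parameters are an assumption about how large file listings can be sliced; the identifiers are the ones used as examples elsewhere on this page.

```python
from typing import Optional
from urllib.parse import quote, urlencode

METADATA_BASE = "https://archive.org/metadata"


def metadata_url(identifier: str, sub_path: Optional[str] = None,
                 start: Optional[int] = None, count: Optional[int] = None) -> str:
    """Build a read-only URL for a whole item, a sub-tree, or a single field."""
    url = f"{METADATA_BASE}/{quote(identifier, safe='')}"
    if sub_path:
        # sub_path is e.g. "files" or "metadata/description"
        url += "/" + "/".join(quote(part, safe="") for part in sub_path.split("/"))
    paging = {k: v for k, v in (("start", start), ("count", count)) if v is not None}
    if paging:
        url += "?" + urlencode(paging)
    return url


print(metadata_url("GratefulDead"))                                  # full record
print(metadata_url("gd1977-05-08.flac16", "metadata/description"))   # one field
print(metadata_url("NASA_Hubble_Deep_Field", "files", start=0, count=50))  # file page
```
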

Write operations: Internet Archive account and S3 keys required

The following tools require your Internet Archive S3 access credentials to be saved in the Feluda Secrets page:

  • Archive Metadata Write: Requires credentials to modify item metadata.
  • Archive Reviews: Requires credentials to read, write, or delete your account's reviews.
  • Archive Tasks: Requires credentials to list tasks for your account and to submit or rerun tasks on items you own.
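
For a sense of what a metadata write involves, the following Python sketch assembles the form fields for a patch request. It assumes the Metadata Write API accepts `-target` and `-patch` form fields and that the patch can be expressed as JSON Patch (RFC 6902) operations; the exact patch dialect the Archive accepts should be checked against its documentation. The helper names are hypothetical and do not describe the gene's internals.

```python
import json
from typing import Dict, List


def append_subject_patch(value: str) -> List[dict]:
    """JSON Patch operation that appends to an existing 'subject' array.

    The "/-" suffix means "end of array", so this assumes 'subject' is already a list.
    """
    return [{"op": "add", "path": "/subject/-", "value": value}]


def build_write_form(patch: List[dict], target: str = "metadata") -> Dict[str, str]:
    """Form fields for a write request (assumed field names).

    target is "metadata" for item-level fields, or "files/<filename>" for
    file-level metadata.
    """
    return {"-target": target, "-patch": json.dumps(patch)}


form = build_write_form(append_subject_patch("open source"))
print(form)
```

The form would then be sent as an authenticated POST to the item's metadata URL; credentials are deliberately left out of this fragment.
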

How to get your Internet Archive S3 keys

Step 1: Create or log in to your Internet Archive account

Visit archive.org and sign in. If you do not have an account yet, registration is free and instant.

Step 2: Generate your S3 access keys

  • Go to archive.org/account/s3.php while logged in.
  • Click Generate new keys. The page will display your access key and secret key.
  • Copy both values immediately; you will not be able to see the secret key again once you leave the page.

Step 3: Save your keys in the Feluda Secrets page

Open the Feluda Desktop App, navigate to the Secrets page, and add the following two secrets:

Secret name | Where to find it
IA_S3_ACCESS | The access key shown on the Internet Archive S3 keys page.
IA_S3_SECRET | The secret key shown on the Internet Archive S3 keys page.

Once these are saved, all five tools in GENE-ArchiveORG become fully operational.
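
The two secrets are typically combined into a single authorization header. The Python sketch below shows one way that could look, assuming the Internet Archive's `LOW access:secret` authorization scheme. How Feluda actually exposes secrets to a tool is not described here, so reading them from a mapping such as environment variables is purely an assumption, and `ia_auth_header` is a hypothetical name.

```python
import os
from typing import Dict, Mapping, Optional


def ia_auth_header(secrets: Mapping[str, str] = os.environ) -> Optional[Dict[str, str]]:
    """Return an Authorization header, or None when either key is missing.

    Returning None lets callers fall back to read-only, unauthenticated requests.
    """
    access = secrets.get("IA_S3_ACCESS")
    secret = secrets.get("IA_S3_SECRET")
    if not access or not secret:
        return None
    return {"Authorization": f"LOW {access}:{secret}"}
```

A caller can treat a `None` result as the signal to expose only the search and read tools.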

Settings you can configure

GENE-ArchiveORG exposes several settings from the Feluda Settings page that let you tune how the gene behaves without changing anything in your workflow.

Setting | Description
User Agent | The identifier sent to the Internet Archive with every request. The default value is appropriate for most users and complies with the Archive's usage guidelines.
Max Tokens In Response | Controls the maximum size of a response before the gene automatically splits it into pages. Large metadata records and file listings will be chunked according to this limit, letting your AI read them in manageable portions.
Search Page Size | The default number of results returned per search request. The Internet Archive API requires a minimum of 100 results per page, so this setting can only raise the page size above that floor.
Task Page Size | The default number of task records returned per task listing request.
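
The effect of Max Tokens In Response can be pictured with a small Python sketch. The gene's real chunking logic is not documented here, so everything below is an assumption made for illustration: tokens are estimated crudely at four characters each, and records are packed greedily into pages that stay under the budget.

```python
import json
from typing import Any, Iterable, List


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (assumption: about four characters per token)."""
    return max(1, len(text) // 4)


def paginate(records: Iterable[Any], max_tokens: int) -> List[List[Any]]:
    """Greedily group records into pages whose estimated size fits max_tokens.

    A single record larger than the budget still gets a page of its own.
    """
    pages: List[List[Any]] = []
    current: List[Any] = []
    used = 0
    for record in records:
        cost = estimate_tokens(json.dumps(record))
        if current and used + cost > max_tokens:
            pages.append(current)
            current, used = [], 0
        current.append(record)
        used += cost
    if current:
        pages.append(current)
    return pages
```

With a file listing of several thousand entries, a lower limit yields more, smaller pages for the AI to read one at a time.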

Every tool has built-in help

Every tool in GENE-ArchiveORG has a built-in help case. When your AI calls any tool with the help option, it immediately receives a structured description of what that tool does, all the parameters it accepts with their types and constraints, which parameters are required and which are optional, and example calls for common tasks. Your AI never has to guess what a tool needs; it can always ask the tool to explain itself first.

How GENE-ArchiveORG works inside Feluda Flows

GENE-ArchiveORG becomes significantly more powerful when used inside Feluda Flows. Every tool in the gene can be wired into a multi-step flow, turning what would otherwise be manual one-off queries into fully automated, repeatable pipelines.

  • Research and discovery pipelines: Build a flow that searches the Archive for items matching a topic, fetches the full metadata record for each result, extracts specific fields such as creator, date, or subject, and assembles a structured report — all without any manual steps.
  • Collection cataloguing: Feed a list of Archive identifiers into a flow that fetches complete metadata and file manifests for each one, compares the results against a local reference, and flags any items that need metadata corrections.
  • Metadata correction workflows: Combine Archive Search to find items that match specific criteria with Archive Metadata Write to update a field across all matching items in one automated run — applying consistent tag, subject, or description corrections at scale.
  • Task monitoring and alerting: Set up a flow that checks the status of pending and running tasks on your Archive account at regular intervals and reports when tasks complete, error, or stall — giving you automatic visibility without having to check manually.
  • Historical web research: Use Archive Search with advanced Lucene queries as an automated research step inside a writing or analysis flow — pulling relevant archived material as background context for your AI before it produces any output.
  • Review management: Build a flow that reads back reviews from a set of your uploaded items, summarises the feedback, and drafts a response — combining Archive Reviews with your AI in a structured review-monitoring loop.
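
The research and cataloguing flows above share one core step: fetch each item's record, pull out a few fields, and note what is missing. The Python sketch below shows that step in isolation. It is not Feluda Flow syntax; `build_report` and the injected `fetch_metadata` callable are hypothetical, and the only assumption about the Archive is that an item record carries its item-level fields under a `metadata` key.

```python
from typing import Callable, Dict, Iterable, List, Sequence

REPORT_FIELDS = ("title", "creator", "date", "subject")


def build_report(identifiers: Iterable[str],
                 fetch_metadata: Callable[[str], dict],
                 fields: Sequence[str] = REPORT_FIELDS) -> List[Dict[str, object]]:
    """Fetch each item, extract the chosen fields, and list any that are empty."""
    rows: List[Dict[str, object]] = []
    for identifier in identifiers:
        meta = fetch_metadata(identifier).get("metadata", {})
        row: Dict[str, object] = {"identifier": identifier}
        for field in fields:
            row[field] = meta.get(field)
        # Fields that are absent or empty are candidates for correction.
        row["missing"] = [f for f in fields if not meta.get(f)]
        rows.append(row)
    return rows
```

The `missing` list is what a metadata correction flow would hand to the write step for items that need attention.
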

Real-world use cases

Here are examples of how different users put GENE-ArchiveORG to work every day.

  • The digital archivist: A librarian managing a collection of several thousand digitised items on the Internet Archive uses the gene to audit metadata quality across the collection. A Feluda Flow fetches the metadata for each item in sequence, checks for missing fields, and submits corrections automatically — a task that previously took weeks of manual work now runs overnight.
  • The researcher: An academic studying the history of early internet culture uses Archive Search with advanced date-range and domain queries to find relevant archived pages. The gene retrieves metadata and file details for each result, which the AI summarises and organises into a structured timeline — turning days of manual browsing into a focused research session.
  • The software preservationist: A volunteer contributor to a retro computing community regularly uploads software to the Archive. They use GENE-ArchiveORG to find items in the collection that are missing subject tags or creator credits, generate appropriate corrections based on the file contents, and apply those updates in bulk using the metadata write tool — systematically improving catalogue quality across hundreds of entries.
  • The content producer: A podcast creator who archives every episode to the Internet Archive uses the task tool to submit re-derivation jobs after uploading updated audio files, and monitors those tasks from within Feluda Flows — receiving a summary the moment each job completes rather than checking the Archive manually.
  • The journalist: An investigative journalist uses Archive Search as the first step in any source research flow. The AI searches the Archive by topic and date range, reads the metadata and descriptions of the most relevant items, and produces a structured source briefing — giving the journalist a head start without hours of manual searching through the Wayback Machine and Archive catalogue.

Handy starting commands

These are ready-to-use prompts you can give your AI once GENE-ArchiveORG is installed and connected. Adapt the wording to your own style; your AI will still understand what to do.

What you want to do | What to say to your AI | What happens
Search the Archive | Search the Internet Archive for items related to early Linux distributions. | Runs a full-text search across the Archive catalogue and returns identifiers, titles, descriptions, and download counts for the top results.
Read an item's metadata | Fetch the full metadata record for the Internet Archive item with identifier GratefulDead. | Retrieves the complete metadata object for the item: title, description, subjects, collection, creator, date, and all other fields.
List an item's files | Show me all the files in the Internet Archive item NASA_Hubble_Deep_Field. | Returns the file manifest, that is, every file associated with the item, including formats, sizes, and checksums.
Read a specific field | What is the description field of Archive item gd1977-05-08.flac16? | Fetches only that specific metadata field instead of pulling the entire record, which is efficient for large items.
Add a subject tag | Add the subject tag "open source" to my Archive item my-software-archive-2024. | Appends the new subject value to the item's existing subject array using a metadata write operation. Requires credentials.
Check task rate limits | What are my current Internet Archive task rate limits for derive operations? | Queries the tasks API and returns your current task limits, how many tasks are in flight, and how many are blocked.
Submit a derive task | Submit a derive task for my Archive item my-uploaded-recording. | Queues a background derive job on the Internet Archive, which rebuilds derivative files (MP3s, thumbnails, etc.) from the source upload. Requires credentials.
Advanced search with filters | Search the Internet Archive for texts about World War II published before 1950, sorted by number of downloads. | Runs an advanced Lucene query with media type, date range, and sort filters applied, returning structured results with pagination support.
Write a review | Write a five-star review for Archive item gd1977-05-08.flac16 with the title "An essential recording" and a short description of why it matters. | Posts a review to the item on behalf of your Archive account. Requires credentials. If a review already exists from your account, this updates it.
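
To show roughly what the "advanced search with filters" prompt could translate to, here is a Python sketch that composes a Lucene query and a page-based request URL. It assumes the public archive.org/advancedsearch.php endpoint with its `q`, `fl[]`, `sort[]`, `rows`, `page`, and `output` parameters. The function name is hypothetical, the query the gene actually generates may differ, and the lower date bound is an arbitrary choice made for the example.

```python
from typing import Optional, Sequence, Tuple
from urllib.parse import urlencode

ADVANCED_SEARCH = "https://archive.org/advancedsearch.php"


def advanced_search_url(terms: str,
                        mediatype: Optional[str] = None,
                        date_range: Optional[Tuple[str, str]] = None,
                        sort: Optional[str] = None,
                        fields: Sequence[str] = ("identifier", "title"),
                        rows: int = 50, page: int = 1) -> str:
    """Join Lucene clauses with AND and encode them as a page-based search URL."""
    clauses = [terms]
    if mediatype:
        clauses.append(f"mediatype:{mediatype}")
    if date_range:
        clauses.append(f"date:[{date_range[0]} TO {date_range[1]}]")
    params = [("q", " AND ".join(clauses))]
    params += [("fl[]", field) for field in fields]   # one fl[] entry per field
    if sort:
        params.append(("sort[]", sort))
    params += [("rows", rows), ("page", page), ("output", "json")]
    return f"{ADVANCED_SEARCH}?{urlencode(params)}"


# Texts about World War II dated before 1950, most downloaded first.
print(advanced_search_url('subject:"World War II"', mediatype="texts",
                          date_range=("1900-01-01", "1949-12-31"),
                          sort="downloads desc"))
```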

Your credentials stay secure on your device

Security is taken seriously at every level of how GENE-ArchiveORG handles your Internet Archive credentials.

  • Stored locally, never uploaded: Your IA_S3_ACCESS and IA_S3_SECRET values are stored in the Feluda Secrets page on your own device. They are never sent to any cloud service, third party, or external server.
  • Only sent to the Internet Archive: The only destination your keys are ever sent to is archive.org directly. All requests go from your device to the Archive's API — no intermediary is involved.
  • Keys are revocable at any time: If you ever want to disconnect the gene or rotate your credentials, you can generate new keys from your Internet Archive account page at archive.org/account/s3.php and update the Feluda Secrets accordingly. Old keys can be invalidated immediately from the same page.
  • Read-only mode is supported without credentials: If you only intend to search and read, you do not need to provide credentials at all. The gene works safely in a no-credentials mode for all public read operations.