
#InternetArchive will automagically OCR uploaded PDF files made from images and I presume this makes them searchable as well.
Is there a way to tell it to OCR uploaded image files and make them searchable?
"Look Ma, I copied the entire internet! #HTTrack, the digital hoarder's dream, lets you download the web so you can finally browse those cat memes offline. Because nothing screams cutting-edge technology like reading 2005 forum threads in 2023."
https://www.httrack.com/ #OfflineBrowsing #DigitalHoarding #InternetArchive #CatMemes #Nostalgia #HackerNews #ngated
It comes from a loose coalition of #archivists & #librarians, who are standing athwart #history & yelling “Save!” They belong to organizations such as the #InternetArchive, which co-created a project called the End of Term Web Archive to back up the federal web in 2008; the Environmental Data & Governance Initiative, or #edgi; & #libraries at major #universities such as #MIT & the University of #Michigan.
New post: An Internet Archive Plugin for Craft CMS 5
#craftcms #plugin #internetarchive #indieweb
https://matthiasott.com/notes/an-internet-archive-plugin-for-craft-cms-5
The erasure of this Black #MedalOfHonor recipient is another example of what I'm helping to document on @wikipedia with the help of the @internetarchive.
So here's a question... why do record labels, publishers, etc. suddenly have such a big interest in suing the Internet Archive in the past few years, despite the Archive having been around for many years doing basically the same things, and there being 0 chance that the IP companies didn't know about it?
I just found out that the Wayback Machine @internetarchive has a great extension, with versions for all the most widely used browsers. From the description:
"Go back in time to see how a website has changed throughout the history of the Web, save websites, access 404 not-found pages, or read archived books and articles."
This is the one for Firefox (and derivatives):
https://addons.mozilla.org/en-US/firefox/addon/wayback-machine_new/
Installed and started the ArchiveTeam Warrior. Very smooth experience.
It downloads stuff and puts it into the Internet Archive.
I took the "ArchiveTeam’s Choice" project and it chose public telegram channels. It's not taking a lot of bandwidth or memory or space or computing, as far as I can tell. It might take too much of my time and focus if I continue staring at the dashboard to try and figure out what all that stuff is.
Internet Archive has a lot of recordings of sound effects records from the 78 rpm era:
I didn't find the legendary Columbia YB-20, but I'm sure it's around
@Rocket @internetarchive There are extensions to automate it so that every single page you visit is automatically saved if it hasn't been saved in the last 30 days.
I use the Internet Archive Saver, which works on Chrome, Safari, Firefox, Edge, etc
https://greasyfork.org/en/scripts/391088-internet-archive-saver/code
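For anyone curious how the "only save if there is no capture from the last 30 days" rule could work, here is a rough Python sketch. It is not the code of the userscript linked above; it only assumes the public Wayback availability endpoint (`https://archive.org/wayback/available`) and its `archived_snapshots.closest` JSON shape. The function names and the `MAX_AGE` constant are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumption: the public Wayback availability endpoint.
AVAILABILITY_API = "https://archive.org/wayback/available"
# Hypothetical threshold matching the "last 30 days" rule in the post.
MAX_AGE = timedelta(days=30)


def latest_snapshot_time(availability: dict):
    """Return the time of the closest snapshot, or None if there is none."""
    closest = availability.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Wayback timestamps have the form YYYYMMDDhhmmss and are in UTC.
    parsed = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def needs_save(availability: dict, now: datetime,
               max_age: timedelta = MAX_AGE) -> bool:
    """True if the page has no snapshot, or only one older than max_age."""
    snapshot = latest_snapshot_time(availability)
    return snapshot is None or now - snapshot > max_age


def fetch_availability(page_url: str) -> dict:
    """Ask the availability endpoint about page_url (makes a network call)."""
    query = urllib.parse.urlencode({"url": page_url})
    with urllib.request.urlopen(f"{AVAILABILITY_API}?{query}", timeout=30) as resp:
        return json.load(resp)
```

Usage would be something like `needs_save(fetch_availability(url), datetime.now(timezone.utc))`. Triggering the capture itself is deliberately left out of the sketch.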
This Saturday at 10 AM, just after the opening of #FtCSF, @mark will be presenting the incredible and invaluable work the #WaybackMachine team is doing to preserve the US government's webpages from the #Biden era that have been erased by the new administration, and more! Join us at the @internetarchive
Info+tickets: https://fundingthecommons.io/sf-2025
@protocollabs #SanFrancisco #WaybackMachine #InternetArchive #OpenKnowledge #USpol
The Internet Archive is preserving old and obsolete 78RPM recordings for posterity, but major labels are claiming copyright infringement.
#FundingTheCommons is happening this weekend!
Join us at the @internetarchive in San Francisco to discuss new models and mechanisms for commons+public goods funding.
Sunday 3/16 at 10 AM #DWeb’s own Wendy Hanamura and #EthereumFoundation’s president Aya Miyaguchi will explore the values, decisions, challenges, and future direction of the #Ethereum community.
Info+tickets: https://fundingthecommons.io/sf-2025
@protocollabs #FtCSF #SF #SanFrancisco #AyaMiyaguchi #WendyHanamura #InternetArchive
Music labels will regret coming for the #InternetArchive, sound historian says - https://arstechnica.com/tech-policy/2025/03/music-labels-will-regret-coming-for-the-internet-archive-sound-historian-says/ "Labels push to spike cost of Internet Archive fight over old 78s." #copyright
I just archived this web page on the #InternetArchive. I plan on doing that with every US government page that I come across with useful information!
#OrganicPhotovoltaics Research
"The benefits promised by #OPV #SolarCells include:
Low-cost manufacturing: Soluble organic molecules enable roll-to-roll processing techniques and allow for low-cost manufacturing.
Abundant materials: The wide abundance of building-block materials may reduce supply and price constraints.
Flexible substrates: The ability to be applied to flexible substrates permits a wide variety of uses."
https://www.energy.gov/eere/solar/organic-photovoltaics-research
Archived version:
https://archive.ph/jhrg9
#SolarPunkSunday #SolarPunk #SolarPower #RenewableEnergy #RenewablesNow
SparkFun were great at keeping all of the product details for their retired products on their website. But that's all gone with the redesign.
If you've got some old SparkFun kit, you're going to be relying on the Internet Archive a lot now.
You Can Download Instructions for Over 6,800 Lego Sets For Free at the Internet Archive via My Modern Met [Shared]
Nothing beats the excitement of opening a brand new LEGO set and unloading those shiny bricks. Sometimes, that happy commotion leads to misplacing a key part of the building process—the instructions. If that's happened to you, then you may want to check out the Internet Archive. They have a database of over 6,800 LEGO set instructions for models from many different eras and types.
The Internet Archive at https://web.archive.org is not working properly for me right now. Is this a local glitch? Can anyone confirm the issue?
"Home movies [...] ancestors of the videos we shot on camcorders and now capture on cell phones. We might think of each home movie as a pixel in a giant collective documentary spanning a hundred years, endless films picturing family, friends, travels, rituals and celebrations."
https://blog.archive.org/2025/03/05/vanishing-culture-no-film-left-unscanned/
Update. "EDGI Relaunches Federal Environmental Web Tracker"
https://envirodatagov.org/press-release-edgi-relaunches-federal-environmental-web-tracker/
"In response to the #Trump administration’s rapid dismantling of federal websites, the Environmental Data & Governance Initiative (#EDGI) has relaunched its Federal Environmental Web Tracker…The… Tracker makes records of significant changes to federal environmental websites publicly available in a searchable database…Since the first Trump administration, EDGI has monitored thousands of federal environmental webpages. Partners at the #InternetArchive download these webpages every day, and EDGI’s #OpenSource software compares versions of these webpages to identify differences."
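To make the "compares versions of these webpages to identify differences" step concrete, here is a minimal Python sketch using the standard library's `difflib`. It is not EDGI's software, and the function name `changed_lines` and the sample text are hypothetical. It only illustrates the general idea of diffing two captured versions of a page's text.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return lines removed ("- ") or added ("+ ") between two page versions."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Hypothetical text extracted from two captures of the same page.
    before = "Climate change data\nContact us"
    after = "Data\nContact us"
    for line in changed_lines(before, after):
        print(line)
```

A real tracker would also need to fetch the captures, strip markup, and decide which changes count as significant. None of that is shown here.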