PDF Indexer

PDF Indexer - Joomla PDF and DOC indexer

Indexes your PDF and MS Word documents and makes their content searchable through Joomla's search functions, including the Joomla Smart Search tool.

Version: 4.0 | Last Updated: Mar 02 2024 | Compatible: Joomla 3.9.0+, Joomla 4, Joomla 5, Joomla 6

Make your documents searchable. Index PDF, Word, and Excel files automatically.

Key Features

Everything you need to make documents searchable

 

Multiple File Formats

Index PDF, DOC, DOCX, and XLSX documents automatically. Content is extracted and stored in the database for fast searching.

 

Bulk Indexing

Index entire folders and subfolders with one click. Save time when managing large document libraries.

 

Native Joomla Search

Seamlessly integrates with Joomla's standard search. Find documents by their content, not just filenames.

 

Auto-Sync

Automatically index new documents and remove deleted files from the search index. Keep everything synchronized.

 

EDOCMAN Integration

Works perfectly with EDOCMAN document management extension. Index and search documents managed by EDOCMAN.

 

Modern & Compatible

Supports Joomla 3.x, 4.x, 5.x, 6.x and PHP 7.x, 8.x. Integrated with Joomla Update System for easy updates.

Ready to Make Your Documents Searchable?

Get PDF Indexer today and improve your site's document search experience.

Troubleshooting

 

1. Issue: pdftotext: Permission denied

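When you index documents, the following error may appear in the indexed content of PDF files:

/components/com_docindexer/lib/binaries/linux/pdftotext: Permission denied

This means the web server user is not allowed to execute the bundled pdftotext binary. You can resolve this by changing the file permission:

  1. Navigate to: /components/com_docindexer/lib/binaries/linux/pdftotext
  2. Change the file permission so the file is executable. 755 is sufficient; 777 also works, but it lets any user on the server modify the binary.
  3. Index the documents again. The error should no longer appear.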
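
If you have shell access, the permission change above can be sketched as a small script. This is a minimal sketch, not the extension's official procedure: the script name and the `make_executable` helper are hypothetical, the Joomla root directory is passed in as an argument, and it applies 755 on the assumption that execute permission is all the binary needs.

```shell
#!/bin/sh
# Sketch: make the bundled pdftotext binary executable.
# Usage: sh fix_pdftotext.sh /path/to/joomla   (script name is hypothetical)

make_executable() {
    binary="$1"
    if [ ! -f "$binary" ]; then
        echo "File not found: $binary" >&2
        return 1
    fi
    # 755 = owner can read/write/execute; group and others can read/execute.
    chmod 755 "$binary"
}

# Only act when a Joomla root directory is passed as the first argument.
if [ "$#" -gt 0 ]; then
    make_executable "$1/components/com_docindexer/lib/binaries/linux/pdftotext"
fi
```

On hosts without shell access, the same change can usually be made from the file manager or FTP client by setting the file's permissions there.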
 

2. How to Set Up a Cron Task to Index New Documents Automatically

Step 1: Enable the Plugin

Publish the plugin: Documents Indexer - Cron task

This plugin has several important parameters:

  • Integrate with Edocman documents: Select "Yes" if you want to index documents in the Edocman documents folder
  • Number of Documents to be Indexed: Documents Indexer does not index all documents in a single execution. Specify how many documents should be indexed per run. The default is 5.

Step 2: Configure Cron Job

By default, Document Indexer uses a system plugin to trigger the indexing process. This means indexing only runs when someone accesses the site (visits from search engine bots count too). This is not always reliable, and on a site with very high traffic it may cause multiple documents to be indexed. To address this limitation, you can set up a cron job from your hosting account to trigger document indexing instead.

Instructions:

Set up a cron job that requests the URL below using cURL. Use cURL so that variables can be passed in the GET request. See this guide for detailed instructions.

URL format:

https://domain.com/index.php?trigger_code=SECRETCODE
  • Replace domain.com with the URL of your site
  • Replace SECRETCODE with the secret string you entered in the "Trigger Code" parameter of the plugin

This will ensure the indexing process is only triggered when a request is made to that URL (which should be kept secret as no real users will access it). This approach is more reliable than relying on a system plugin.
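
As a rough illustration of the cron setup described above, the request could be wrapped in a small script and scheduled from crontab. This is a sketch under assumptions: the script name, the helper functions, the 60-second timeout, and the 10-minute interval are all hypothetical, and `domain.com` and `SECRETCODE` are the same placeholders used in the URL format. The trigger code is inserted as-is, so it assumes a code without characters that need URL encoding.

```shell
#!/bin/sh
# Sketch: request the trigger URL so that indexing runs on a schedule.
# Usage: sh trigger_indexing.sh https://domain.com SECRETCODE
# (script name is hypothetical; both arguments are placeholders)

build_trigger_url() {
    # $1 = site URL without a trailing slash, $2 = trigger code
    printf '%s/index.php?trigger_code=%s\n' "$1" "$2"
}

trigger_indexing() {
    # -f: treat HTTP errors as failures; -sS: quiet, but still report errors;
    # --max-time: give up after 60 seconds (an assumed limit).
    curl -fsS --max-time 60 "$(build_trigger_url "$1" "$2")" >/dev/null
}

# Only send the request when both arguments are supplied.
if [ "$#" -eq 2 ]; then
    trigger_indexing "$1" "$2"
fi

# Example crontab entry, running every 10 minutes (the interval is an assumption):
# */10 * * * * /bin/sh /path/to/trigger_indexing.sh https://domain.com SECRETCODE
```

Many hosting control panels accept only a single command line for a cron job. In that case the curl call can be entered directly, with the URL in double quotes so the shell does not interpret the `?` character.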

[Image: Cron job setup example]