atextcrawler.search package

Submodules

atextcrawler.search.engine module

Search engine backend; currently Elasticsearch.

We have one index per supported language and a default one.
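The per-language layout can be pictured with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the `text_` prefix, the `SUPPORTED_LANGS` set, the `DEFAULT_INDEX` name and the `index_name_for` helper are assumptions, not the names atextcrawler actually uses.

```python
# Hypothetical illustration: the naming scheme and language set are
# assumptions, not taken from atextcrawler.
SUPPORTED_LANGS = {"en", "de"}   # assumed example configuration
DEFAULT_INDEX = "text_default"   # assumed name of the default index


def index_name_for(lang):
    """Return the index for a supported language, else the default index."""
    if lang in SUPPORTED_LANGS:
        return f"text_{lang}"
    # Unsupported or unknown language (including None) uses the default index.
    return DEFAULT_INDEX
```

Under these assumptions a resource in a supported language lands in its own index, and anything else falls back to the default one.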

async atextcrawler.search.engine.close_indices(engine)

Close indices. UNUSED.

async atextcrawler.search.engine.create_indices(engine)

Create indices for all configured languages.

async atextcrawler.search.engine.delete_resource(engine, lang, resource_id)

Delete a resource.

async atextcrawler.search.engine.find_duplicate(engine, site_id, resource) → Union[bool, None, int]

UNUSED.

Try to find a duplicate resource with matching site.

If the search backend query fails, return False. If no matching resource was found, return None. If a matching resource was found, return its id.
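Because the three outcomes share one return value, a caller has to tell them apart by identity rather than truthiness: `False`, `None` and a resource id of `0` are all falsy. The following is a minimal sketch of such a check; `describe_duplicate_result` is a hypothetical helper, not part of atextcrawler.

```python
from typing import Union


def describe_duplicate_result(result: Union[bool, None, int]) -> str:
    """Interpret a value shaped like the documented return of find_duplicate."""
    # Test `is False` first: bool is a subclass of int, so a plain
    # isinstance(result, int) check would also match False.
    if result is False:
        return "query failed"
    if result is None:
        return "no duplicate"
    # Anything else is taken to be the id of the matching resource.
    return f"duplicate of resource {result}"
```

A plain `if not result:` would conflate a failed query with a missing match, which is why the sketch compares against `False` and `None` explicitly.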

async atextcrawler.search.engine.index_resource(engine, tf, site_path, resource, base_url, url)

Index a resource.

async atextcrawler.search.engine.open_indices(engine)

Open indices for all configured languages.

async atextcrawler.search.engine.shutdown_engine(engine)

Close the connection to the search engine.

async atextcrawler.search.engine.startup_engine(config)

Open the search engine for access.
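startup_engine and shutdown_engine form a natural pair, so one possible way to use them is a try/finally wrapper that always releases the connection. The sketch below is an assumption about usage, not code from atextcrawler: `run_with_engine` is a hypothetical helper, and the `fake_*` coroutines are stand-ins so the example runs without a search backend. In real use, the documented startup_engine and shutdown_engine would be passed in their place.

```python
import asyncio


async def run_with_engine(startup, shutdown, config, work):
    """Start an engine, run work(engine), and always shut the engine down."""
    engine = await startup(config)
    try:
        return await work(engine)
    finally:
        # Runs even if work() raises, so the connection is not leaked.
        await shutdown(engine)


# Stand-ins for startup_engine / shutdown_engine (hypothetical, for the demo).
events = []


async def fake_startup(config):
    events.append("startup")
    return {"config": config}


async def fake_shutdown(engine):
    events.append("shutdown")


async def fake_work(engine):
    events.append("work")
    return engine["config"]["host"]


result = asyncio.run(
    run_with_engine(fake_startup, fake_shutdown, {"host": "localhost"}, fake_work)
)
```

Passing the startup and shutdown coroutines as parameters keeps the wrapper independent of the actual engine object, whose type this page does not document.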

Module contents