What is federated search and why is it so good?

With so much data now in existence in the world, having a source for intelligence at every turn can often be a double-edged sword. It is great to be able to find an address for someone, but having to log into many different systems to find it can be time-consuming and problematic.

Siloed data is an issue that exists right across law enforcement. Most law enforcement investigations start with a name, address, or telephone number that you wish to find out more about. Often, the data on that entity will sit in multiple different places. If you have lots of systems, sometimes up to 30 as you do in policing, going to each of them in turn could take a lot of time... too much time. So, do you stop searching the ones that do not give you great results or are difficult to access?

Do you run the risk of missing the golden nugget of information that could make the difference in an investigation?

Thankfully, there is a knight in shining armour to go with the sword. Federated Search.

It is a bit of a technical term but let me explain what it is and how it works.

What is federated search?

Federated search is about asking one question of several different systems simultaneously. In the policing context, this could mean searching your records management system, social media, a network drive, and a cloud environment all at once. It is exactly what our new tool, Chorus Search, can do.

It saves time and any results that might be related to each other are displayed in the same system. If you are looking for correlations, then you can look across the data with a single lens.
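To make the idea of asking one question of several systems at once more concrete, here is a minimal sketch in Python. It is an illustration only, not how Chorus Search is built: the two source functions, their sample data, and the names SOURCES and federated_search are all hypothetical stand-ins for real systems.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real source systems. Each takes a search term
# and returns a list of matching records.
def search_records_system(term):
    data = ["John Smith, 1 High Street", "Jane Doe, 2 Low Road"]
    return [row for row in data if term.lower() in row.lower()]

def search_network_drive(term):
    files = ["john_smith_statement.docx", "vehicle_log.xlsx"]
    needle = term.lower().replace(" ", "_")
    return [name for name in files if needle in name.lower()]

SOURCES = {
    "records": search_records_system,
    "network_drive": search_network_drive,
}

def federated_search(term, sources=SOURCES):
    """Send the same term to every source at once and group results by source."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # Submit every search before waiting on any of them, so they run concurrently.
        futures = {name: pool.submit(fn, term) for name, fn in sources.items()}
        return {name: future.result() for name, future in futures.items()}

results = federated_search("John Smith")
```

The results come back grouped by source in a single structure, which is what lets them be displayed and compared in one place.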

How Chorus federated search works

Federated search starts with a question typed in free text. First, we assign a semantic type to that search query. This type describes the kind of information the search term represents. ‘John Smith’ is a good example of this, as it could be a person or a beer. We have developed a framework that assigns a semantic type by matching the search term to a type (for example, identifying a first name and a last name would mean it’s a person), but the user can also override this.

This saves time and narrows the results when the search is executed.
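As a rough illustration of assigning a semantic type with a user override, the sketch below matches a term against a few simple patterns. The patterns, the type names, and the function infer_semantic_type are assumptions made for this example; the actual framework is not described in enough detail here to reproduce.

```python
import re

# Illustrative patterns only. A real framework would need far richer rules.
PATTERNS = [
    ("telephone", re.compile(r"^\+?[\d\s]{7,15}$")),
    ("email", re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")),
    ("person", re.compile(r"^[A-Za-z'-]+(\s[A-Za-z'-]+)+$")),
]

def infer_semantic_type(term, override=None):
    """Return the user's override if given, else the first matching type."""
    if override is not None:
        return override
    cleaned = term.strip()
    for semantic_type, pattern in PATTERNS:
        if pattern.match(cleaned):
            return semantic_type
    # Nothing matched, so fall back to an untyped free-text search.
    return "free_text"
```

With these patterns, infer_semantic_type("John Smith") gives "person", while infer_semantic_type("John Smith", override="product") lets a user who is looking for the beer say so.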

Once we have the search term and the semantic type, we translate them into the query language of each source system we want to search, so that equivalent queries can be executed and the results returned.
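The translation step might look something like the sketch below, which turns one search term and semantic type into two different query forms: a parameterised SQL statement and a URL query string for a web API. The table names, column names, and the /search endpoint are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode

def to_sql(term, semantic_type):
    """Build a parameterised SQL query. Table and column names are hypothetical."""
    if semantic_type == "person":
        first, _, last = term.partition(" ")
        return (
            "SELECT * FROM people WHERE first_name = ? AND last_name = ?",
            (first, last),
        )
    if semantic_type == "telephone":
        return ("SELECT * FROM phones WHERE number = ?", (term.replace(" ", ""),))
    # Fall back to a substring match for untyped searches.
    return ("SELECT * FROM documents WHERE body LIKE ?", (f"%{term}%",))

def to_rest(term, semantic_type):
    """Build a query string for a hypothetical HTTP search endpoint."""
    return "/search?" + urlencode({"q": term, "type": semantic_type})

sql_query = to_sql("John Smith", "person")
rest_query = to_rest("John Smith", "person")
```

Passing the values as parameters, rather than pasting them into the SQL text, keeps free-text input from being interpreted as part of the query.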

Connecting to different systems

To search all these different systems, we first need to connect to them.

Here we have different strategies depending on where the data sits and whether the source system can be accessed via an Application Programming Interface (API) or Software Development Kit (SDK).

If the data is online or sits somewhere that can interface with the outside world then we can connect via APIs or SDKs and ask the question of that data through these connections.

If the source is a database that is queried using Structured Query Language (SQL), such as iBase, then it has a query language interface, and we can pose a question and get the results back.

For these scenarios we develop a connector. That connector acts as an interpreter between our system and the target data source, translating the query into the right language for the data source and then back again so we can represent the results visually in Chorus Search.
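A connector of this kind can be sketched as a small class that accepts the common query, speaks the source's own language, and hands back results in one shared shape. The example below uses an in-memory SQLite database as a stand-in for a SQL source. The Connector and SqlConnector classes, the people table, and the result fields are all assumptions made for illustration, not the Chorus connector design.

```python
import sqlite3
from abc import ABC, abstractmethod

class Connector(ABC):
    """Common interface: every source answers the same kind of question."""

    @abstractmethod
    def search(self, term, semantic_type):
        ...

class SqlConnector(Connector):
    def __init__(self, connection, source_name):
        self.connection = connection
        self.source_name = source_name

    def search(self, term, semantic_type):
        # This sketch only knows how to translate person searches.
        if semantic_type != "person":
            return []
        first, _, last = term.partition(" ")
        rows = self.connection.execute(
            "SELECT first_name, last_name, address FROM people "
            "WHERE first_name = ? AND last_name = ?",
            (first, last),
        ).fetchall()
        # Translate source-specific rows back into one common result shape.
        return [
            {"source": self.source_name, "label": f"{fn} {ln}", "detail": address}
            for fn, ln, address in rows
        ]

# Stand-in data source with a hypothetical schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (first_name TEXT, last_name TEXT, address TEXT)")
db.execute("INSERT INTO people VALUES ('John', 'Smith', '1 High Street')")

connector = SqlConnector(db, "demo_sql_source")
hits = connector.search("John Smith", "person")
```

Because every connector returns the same fields, results from very different systems can be shown side by side.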


If the data really is in a silo that does not have any connectivity, then we perform an extract, transform, and load (ETL) process. We lift that data and put it somewhere we can access, like iBase, so we can then connect to it using the translation method above.

How often that ETL is performed depends on the rate of change of the data. If the data is historic and is no longer updated, once is sufficient. If it is something that changes regularly, then the ETL could be repeated on a schedule, for example weekly, according to how frequently the user needs to see that change.
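The extract, transform, and load steps, together with a refresh rule driven by how often the data changes, can be sketched as follows. The CSV export, the contacts table, and the function names are hypothetical, and an in-memory SQLite database stands in for the accessible store.

```python
import csv
import io
import sqlite3
from datetime import datetime, timedelta

def extract(csv_text):
    """Read rows out of the silo's export (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Tidy names and strip spaces from phone numbers."""
    return [(r["name"].strip().title(), r["phone"].replace(" ", "")) for r in rows]

def load(connection, records):
    """Replace the searchable copy with the freshly transformed records."""
    connection.execute("CREATE TABLE IF NOT EXISTS contacts (name TEXT, phone TEXT)")
    connection.execute("DELETE FROM contacts")
    connection.executemany("INSERT INTO contacts VALUES (?, ?)", records)
    connection.commit()

def refresh_due(last_run, interval, now):
    """interval=None marks a historic dataset that only needs loading once."""
    if last_run is None:
        return True
    if interval is None:
        return False
    return now - last_run >= interval

silo_export = "name,phone\n john smith ,07700 900123\n"
db = sqlite3.connect(":memory:")
load(db, transform(extract(silo_export)))
loaded = db.execute("SELECT name, phone FROM contacts").fetchall()
```

For a weekly refresh, refresh_due would be called with interval=timedelta(weeks=1); for a historic dataset, passing interval=None means the load runs once and never again.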

What is good about federated search?

I’ve already mentioned the time-saving benefits of federated search, but it has a few more up its sleeve that will be valuable to those conducting digital investigations.

You can be confident that you have explored all available data, reducing the risk that you miss something from somewhere that is hard to access.

The setup of federated search can be more secure than querying separate systems individually. Just one ID is required rather than generating logins for all the different systems. We all know how hard it is to remember all our passwords!

And only having to log into one system means that you only need to be trained on one piece of software.

Finally, auditing is an important aspect of digital investigations. Chorus Search has an audit function built in, so any auditor has just one place to go to find out what information has been accessed, by whom, and when.
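To show why a single entry point makes auditing simpler, here is a minimal sketch in which every search passes through one function that records who searched for what and when. The audit_log list, the field names, and audited_search are hypothetical. The built-in audit function in Chorus Search is not described here and may work quite differently.

```python
from datetime import datetime, timezone

# In this sketch the audit trail is an in-memory list; a real system
# would write to durable, tamper-resistant storage.
audit_log = []

def audited_search(user_id, term, search_fn):
    """Run a search and record who ran it, what they searched for, and when."""
    results = search_fn(term)
    audit_log.append({
        "user": user_id,
        "term": term,
        "when": datetime.now(timezone.utc).isoformat(),
        "result_count": len(results),
    })
    return results

# Hypothetical search function standing in for the federated search itself.
found = audited_search("officer_1", "John Smith", lambda term: ["example hit"])
```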

All these things help make Chorus Search one of the most exciting and beneficial intelligence tools out there. But now you know there is plenty going on behind the simple search box.

For more information visit https://chorusintel.com/product/search/

Post provided by Adam Etches – Technical Director – Chorus

