The Internet is the largest collection of data humanity has ever created, and that has not gone unnoticed by governments, law enforcement and the private sector. Nowadays, many people post their personal information freely and willingly on social networks and elsewhere on the net. Why not use this gigantic and free dataset and analyze it with AI?
What is OSINT?
During the Cold War, gathering intelligence about the enemy was all the rage. In those days HUMINT (human intelligence, gathered from people), COMINT (communications intelligence, gathered from intercepted communications) and the wider field of SIGINT (signals intelligence, which covers COMINT as well as other electronic signals) were the main sources of intelligence, although there are others.
Then came the Internet. In the beginning not much changed, but once the Internet went mainstream and social networks and the smartphone arrived, everyone suddenly had the opportunity to publish anything online in a matter of seconds, with worldwide reach.
And this created a whole new category of intelligence: OSINT (Open Source Intelligence), that is, intelligence produced by collecting publicly available information. To be frank, OSINT is actually much older and dates from WWII, when intelligence was gathered by listening to enemy public radio and, later, TV. But with the digitization of information and the Internet, OSINT really went into overdrive.
Information is not intelligence
Before we go further, it is necessary to clarify one thing: having information available is not the same as having intelligence. What turns information into intelligence is having a purpose when looking at it. Or to say it another way: if the information you have collected is used to gain an insight or answer a specific question, it becomes intelligence. The usual methods of extracting answers from information are analyzing it and correlating it with other data.
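As a minimal sketch of what "correlating it with other data" can mean in practice, the following Python snippet groups records from different sources that share the same value for a field. The records, the field names (`source`, `handle`) and the function `correlate_by_key` are all made up for illustration; real OSINT tools use far more robust matching.

```python
from collections import defaultdict


def correlate_by_key(records, key):
    """Group records that share the same value for `key` across sources."""
    groups = defaultdict(list)
    for record in records:
        value = record.get(key)
        if value is not None:
            # Normalize so that "Night_Owl" and "night_owl " match.
            groups[value.strip().lower()].append(record)
    # Keep only values that appear in more than one distinct source.
    return {
        value: recs
        for value, recs in groups.items()
        if len({r.get("source") for r in recs}) > 1
    }


# Hypothetical, invented records from two public sources.
records = [
    {"source": "forum", "handle": "Night_Owl"},
    {"source": "marketplace", "handle": "night_owl "},
    {"source": "forum", "handle": "daywalker"},
    {"source": "forum", "handle": "daywalker"},
]

matches = correlate_by_key(records, "handle")
for handle, recs in matches.items():
    print(handle, sorted(r["source"] for r in recs))
```

The raw records are just information. The question "does the same handle show up on more than one platform?" is what turns the grouped result into something closer to intelligence.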
What are the main OSINT data sources?
In general, as the term “Open Source” suggests, your data sources must be publicly accessible to qualify as OSINT. This includes the following:
- News Media
- Social Media Platforms
- Websites
- Libraries
- The Dark Web
- Public Records
- Images, Audio or Video
Except, maybe, for the Dark Web, almost everyone reading this knows how to find data in all of the previous sources, using just a web browser. And even the Dark Web isn’t really that difficult.
The information you can collect from the Internet usually comes in one of four formats: text, audio, images or video. There are obviously additional formats, like presentations, PDFs and databases, but you can usually convert all of those into one of the main four.
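A small sketch of that idea, assuming files are sorted by extension: each file is mapped to one of the four main formats, and other formats are flagged as needing conversion first. The extension tables and the function `main_format` are illustrative assumptions, not a complete or authoritative list.

```python
from pathlib import Path

# Extensions that already belong to one of the four main formats (illustrative).
MAIN_FORMATS = {
    ".txt": "text", ".html": "text", ".csv": "text",
    ".mp3": "audio", ".wav": "audio",
    ".jpg": "image", ".jpeg": "image", ".png": "image",
    ".mp4": "video", ".mkv": "video",
}

# Other formats and the main format we assume they would be converted to.
CONVERTIBLE_TO = {".pdf": "text", ".pptx": "text", ".sqlite": "text"}


def main_format(filename):
    """Return (main_format, needs_conversion) for a file name."""
    ext = Path(filename).suffix.lower()
    if ext in MAIN_FORMATS:
        return MAIN_FORMATS[ext], False
    if ext in CONVERTIBLE_TO:
        return CONVERTIBLE_TO[ext], True
    return "unknown", False


for name in ["interview.mp3", "Report.PDF", "clip.mp4"]:
    print(name, main_format(name))
```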

Who uses OSINT and for what?
In general, OSINT is available to everyone. However, complex investigations need more search power than is available to most private citizens. That is why the groups that use OSINT the most are the following:
- Law Enforcement. One of the biggest consumers of OSINT is law enforcement, which uses it to support investigations, uncover relationships between people, track them online, etc.
- Military. OSINT originated in the military, which still uses it today to identify possible threats or to gather information about individuals in specific countries.
- Government. The government can use OSINT to find out more about someone, for any purpose (taxes, legal, etc.) and obviously uses it in its intelligence services.
- Law firms and insurance companies. In this case the motivations can be varied, but having access to OSINT can help solve or establish many legal cases or insurance claims.
- Finance. The finance sector can also benefit from keeping an eye on competitors, specific markets or the public activities of people with the ability to influence markets.
- Investigators. Journalists and private investigators can use OSINT to advance their story or case by looking into information about the parties involved.
- Marketing. Analyzing markets, market trends or even trends on social media can give insights into future product or service developments or how to orient a campaign. It is also useful for finding out the reputation of a company.
In fact, even you may have used OSINT, for example to find out information about the seller of an item you wanted to buy, or to check someone's social media before hiring them.
How can you make OSINT smarter?
The main problem with OSINT systems is that they are very good at collecting information and finding relationships, but cannot use every type of data they get. Databases and text files, PDF documents and the like are easy to read and analyze. Audio, images and video, however, are not. They represent a certain type of unstructured data that cannot be analyzed without viewing or hearing it.
And that is where tools like Intelion come into play. They can be integrated into the data acquisition and analysis process and are able to make sense of audio, picture and video files by applying different analyzers to them, for example to find a specific face, sound or audio pattern. Intelion can also transcribe speech to text and translate it, or find objects in a picture or video. All of the findings are converted into metadata (in text format) that is then sent back to the OSINT system, which can apply its usual processing techniques to this data. This expands the range of media that can be used as actionable data beyond simple text. The software is intended for use by law enforcement and intelligence agencies.
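The source does not describe Intelion's actual output format or API, so the following is only a hypothetical sketch of the general idea: findings from media analyzers are flattened into text metadata (here JSON Lines), which a text-only OSINT system can then search with its usual techniques. The `Finding` fields, the analyzer names and both functions are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Finding:
    """One hypothetical analyzer result for a media file."""
    media_file: str
    analyzer: str          # e.g. "speech_to_text" or "object_detection"
    start_seconds: float   # position in the media where the finding occurs
    value: str             # the finding itself, expressed as text


def findings_to_metadata(findings):
    """Serialize findings as text, one JSON object per line."""
    return "\n".join(json.dumps(asdict(f), sort_keys=True) for f in findings)


def search_metadata(metadata_text, term):
    """Plain-text search over the metadata, as a text-only system could do."""
    term = term.lower()
    rows = [json.loads(line) for line in metadata_text.splitlines() if line]
    return [row for row in rows if term in row["value"].lower()]


# Invented example findings for a single video file.
findings = [
    Finding("clip.mp4", "speech_to_text", 12.5, "We meet at the harbor at nine"),
    Finding("clip.mp4", "object_detection", 14.0, "truck"),
]

metadata = findings_to_metadata(findings)
print(search_metadata(metadata, "harbor"))
```

Once the findings exist as text, the video becomes searchable in the same way as any document, which is the point the paragraph above makes.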
Conclusion
Most OSINT systems don’t have the capacity to directly process audio or video files and therefore cannot use the information they contain. With AI platforms like Intelion, metadata is extracted from all of these files following custom rules that can be specified. This gives any OSINT system a new dimension and access to one of the biggest data sources on the Internet: video.