Google has long been more than just a search tool – it has become an omnipresent cognitive extension of human beings, deeply embedded in our everyday lives and thought processes. The search engine acts as a kind of external memory that can be consulted at any time – whether out of curiosity, uncertainty or for concrete planning.
So it’s no wonder that in the digital age, criminal activities are increasingly being prepared using Google. Targeted information can be extracted from around 1.1 billion websites and countless other online sources – quickly, anonymously and with a wide reach.
However, the close interaction between man and machine not only brings convenience, but also leaves traces – often unconsciously and unintentionally. In digital reconnaissance, even casual or supposedly harmless search queries can provide clues that can be subsequently made visible under certain technical and legal conditions. In this way, Google becomes a silent witness – and sometimes also an investigative tool.
In terms of global usage, the US search engine Google has the largest market share: 79 per cent of desktop searches and around 94 per cent of searches on mobile devices. However, usage varies significantly from country to country. While almost every search query in Iran is made via Google, the proportion in Russia is between 22 and 53 per cent, depending on the source.
On average, Google answers around 40,000 search queries per second worldwide.
With Google Trends, Google provides an analysis tool that visualises searches over time and by geographical origin. It can also display searches that developed in a similar way within a selected time window. Google Trends was designed for digital marketing: it is meant to make it easier to recognise needs and to plan specific marketing measures.
In a forensic context, searches that can be assigned to a group of offenders, accomplices or an individual offender must be reliably identified as such among the plethora of searches. In most cases, the easiest way to do this is to relate the data to a specific time period. If, for example, searches can be observed that relate to an offence that still lay in the future and was not yet suspected, this implies prior knowledge or at least a specific or unspecific expectation. In this way, it may be possible to draw conclusions about the origin of these individuals.
In order for this to succeed, we must take into account the restrictions and special features associated with the handling of the data. These are described below.
Google Trends distinguishes between live data and historical data. The more accurate, higher-resolution live data extends 7 days into the past. With the less detailed historical data, we can look back as far as 2004. However, the historical data is hardly suitable for meaningful analyses, especially when the search volume is low.
To protect personal data, Google Trends only takes into account searches for people who are of public interest. Google must therefore register a certain number of independent searches before the corresponding search histories are displayed.
Google Trends does not provide absolute values. Instead, the values are plotted on a relative scale from 0 to 100, where 100 marks the highest search interest within the selected time period. For an approximate estimate of the order of magnitude, several search phrases can be compared with each other over time.
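The effect of this relative scale can be illustrated with a minimal sketch. It assumes a simplified model in which raw counts are divided by the peak of the window; Google's actual calculation is not documented in this article and also takes total search volume into account. The function names and all numbers are hypothetical.

```python
def normalise_to_peak(counts):
    """Scale raw counts so that the largest value in the window becomes 100."""
    peak = max(counts)
    if peak == 0:
        return [0 for _ in counts]
    return [round(100 * c / peak) for c in counts]


def normalise_jointly(series_by_phrase):
    """Scale several phrases against their common peak, as in a comparison."""
    peak = max(max(counts) for counts in series_by_phrase.values())
    if peak == 0:
        return {p: [0 for _ in c] for p, c in series_by_phrase.items()}
    return {
        phrase: [round(100 * c / peak) for c in counts]
        for phrase, counts in series_by_phrase.items()
    }


# Hypothetical raw daily counts (invented numbers, not real Google data).
raw = {
    "phrase_a": [20, 40, 80, 40],
    "phrase_b": [2000, 4000, 8000, 4000],
}

# Viewed separately, both phrases produce the same curve.
separate = {phrase: normalise_to_peak(counts) for phrase, counts in raw.items()}

# Viewed together, the rarer phrase almost disappears.
joint = normalise_jointly(raw)

print(separate)
print(joint)
```

Viewed on its own, each phrase yields an identical curve although the underlying volumes differ by a factor of 100. Only the joint comparison reveals the difference in magnitude, which is why comparing against a phrase of known popularity helps with a rough estimate.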
Google Trends distinguishes between regular searches and exact matches. A regular search can also cover different arrangements of the individual search terms and context-related additions. An exact match, by contrast, only counts searches that use exactly the spelling entered within the inverted commas.
Google derives the origin of a search query from the IP address or from the location of the mobile device. The location can be concealed using various techniques, such as a virtual private network (VPN). However, practical analyses have shown that the tendency to conceal one’s own location depends on the type of search term, the offender group and, in turn, the location.
Analyses with Google Trends should be supplemented with verifiable real-world references wherever possible. This means also including data whose validity is beyond doubt because it correlates with facts that anyone can verify. This data can then be compared with findings that are not directly verifiable.
Google Trends was designed to analyse large volumes of data. This can already be seen in the system architecture, which only takes into account random samples of searches and fills gaps in the data with calculated probabilities. As a result, we may not see all searches even though they were carried out. Conversely, digital artefacts can appear, i.e. apparent searches that were never actually carried out. For this reason, searches that are expected to have a very low search volume must always be critically scrutinised. As each query via Google Trends also returns a random sample, the results do not always appear stable over time. They can very quickly be dismissed as invalid, as digital noise.
To counter this, it is not enough to check the searches under analysis several times. A temporal cross-validation is also required.
“If a search term only has a very low search volume in the period under consideration, small deviations can occur even in closed periods.”
Isabelle Sonnenfeld, Head of Google NewsLab
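The repeated checking of a low-volume search can be sketched as follows. This is only an illustration under stated assumptions: the same query has been retrieved several times, each retrieval is a list of index values over the same time points, and a time point counts as stable if it is non-zero in a sufficient share of the retrievals. The function name `stable_points`, the threshold of 75 per cent and the numbers are hypothetical.

```python
def stable_points(pulls, min_share=0.75):
    """Return the indices of time points that are non-zero in at least
    `min_share` of the repeated retrievals (pulls) of the same query."""
    if not pulls:
        return []
    n_points = len(pulls[0])
    if any(len(pull) != n_points for pull in pulls):
        raise ValueError("all pulls must cover the same time points")
    stable = []
    for i in range(n_points):
        hits = sum(1 for pull in pulls if pull[i] > 0)
        if hits / len(pulls) >= min_share:
            stable.append(i)
    return stable


# Five hypothetical retrievals of the same query (invented index values).
pulls = [
    [0, 0, 34, 100, 0, 0],
    [0, 0, 41, 100, 0, 27],
    [0, 0, 29, 100, 0, 0],
    [0, 18, 37, 100, 0, 0],
    [0, 0, 0, 100, 0, 0],
]

print(stable_points(pulls))  # [2, 3]
```

In this invented example, the isolated values at positions 1 and 5 appear in only one of five retrievals and are treated as possible sampling artefacts, while positions 2 and 3 recur and deserve a closer look.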
Analysing an individual search is like looking at a fragment – often unclear, random or difficult to categorise on its own. Only when several of these seemingly unrelated parts are superimposed on the time axis – like transparencies in the light of a focused lamp – does a much more meaningful overall picture emerge. For example, a consistent increase in different but thematically related searches can be interpreted as a statistically plausible indication. Cross-validation serves to embed the individual signals in a higher-level pattern – and helps to distinguish valid indications from mere artefacts.
Similar to a composite silhouette, the actual form – in this case a hypothesis with substance – is only revealed when the layers are superimposed. This approach requires care, but it increases the likelihood of recognising patterns that would remain hidden if viewed in isolation.
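The superposition of several searches on a common time axis can be sketched in a few lines. The sketch assumes that each search term is available as a list of index values over identical time points and flags those time points at which at least a given number of terms reach a threshold at once. The function name `coinciding_windows`, the threshold of 50 and the data are hypothetical, and because each series is scaled to its own peak, the threshold says nothing about absolute volumes.

```python
def coinciding_windows(series_by_term, threshold=50, min_terms=2):
    """Return (index, terms) pairs for time points at which at least
    `min_terms` series reach `threshold` simultaneously."""
    lengths = {len(values) for values in series_by_term.values()}
    if len(lengths) > 1:
        raise ValueError("all series must cover the same time points")
    n_points = lengths.pop() if lengths else 0
    result = []
    for i in range(n_points):
        active = sorted(
            term for term, values in series_by_term.items()
            if values[i] >= threshold
        )
        if len(active) >= min_terms:
            result.append((i, active))
    return result


# Hypothetical index values for three thematically related terms.
series = {
    "term A": [0, 0, 60, 100, 0, 0, 0],
    "term B": [0, 0, 0, 100, 70, 0, 0],
    "term C": [0, 55, 0, 0, 0, 0, 100],
}

print(coinciding_windows(series))  # [(3, ['term A', 'term B'])]
```

A coincidence found in this way is only a starting point: it shows where the layers overlap, not why they overlap.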
The situation described can be clearly seen in the following example. The illustration shows that the name ‘Anton Cherepennikov’ was searched for in the same time periods as information on xenon therapy.
The Russian entrepreneur Cherepennikov was found dead after these searches had been made. The cause of death was determined to be intoxication by xenon gas.
In practice, it has also been shown time and again that the analysis was influenced more by missing data than by ‘confabulated’ data, as the latter could be reduced by cross-validation. Such data gaps tend to invalidate rather than substantiate previously formulated hypotheses.
In most cases, data collection is preceded by the formulation of a hypothesis that is to be refuted or substantiated later on. When potentially conspicuous data is found, a considerable amount of time must therefore be invested in looking for alternative explanations for the findings. Only if none can be found can the finding serve as a starting point for further investigations.
And this is precisely the purpose of the methodology: Google Trends can only provide one aspect of a comprehensive investigation. It is one part and can help to focus the investigation in individual cases.
The clues derived from Google Trends cannot therefore provide evidence in the traditional sense. They can merely pose questions that can contribute to informed decision-making in the investigation.
Steven Broschart has been working intensively with the Google search engine since 2003. He has been using the data and analysis options of Google Trends since 2006. Since 2018, he has focussed in particular on forensic analysis, in which digital traces are compared with real-world events. On this basis – and in conjunction with other data sources – he supports German investigative authorities in solving complex cases.
Google Trends is not as intuitive as it seems at first glance. Its meaningful use requires expertise. Some critics claim that Google Trends data can be misleading, but these claims are often based on misinterpretations rather than inaccuracies in the data itself.
About Google Trends
About the data quality of Google Trends
In a forensic context