The use of big social data

Visualization of big data and network communication
© Getty Images/MF3d

How can data from social platforms be used for science-based social research, and what are the associated opportunities and risks? These were the key issues at a Lab Talk on 19 June with AI expert Ingmar Weber, holder of a Humboldt Professorship.

The publicly accessible body of data collected by Facebook and other platforms creates huge potential for science-based social research. ‘My use of non-traditional data sources enables me to better measure and understand societal phenomena, such as international migration’, said AI expert Ingmar Weber in the opening remarks to his Lab Talk on Alumniportal Deutschland, entitled ‘What Advertising Data Tells Us About Society’.

Pioneering work in the field of computational social science

Weber is a pioneer in the interdisciplinary field of research called ‘computational social science’, which combines computer science and the social sciences. He used the Facebook interface for advertisers to show the participating alumni the wide range of information that is freely available from the anonymised, aggregated data, including the proportion of female Facebook users in a particular region, age structures and the proportion of users with a higher educational qualification. Searches can also reveal the proportion of users with iOS and with Android as their operating system. Since iOS tends to run on more expensive devices, this information can be used to draw conclusions about the socioeconomic structure of urban districts, regions or entire federal states. Weber illustrated this vividly with the example of Manhattan, where almost 90 per cent of users are on iOS, whereas Android predominates in The Bronx.

Improving our coexistence with computer science

(Video in German)

Research into international migration

Ingmar Weber heads the ‘’ at Saarland University, which was inaugurated in 2023; he previously conducted research at the Qatar Computing Research Institute in Doha. His studies of international migration are particularly pioneering. One example is his use of Facebook data to investigate the ‘Venezuela exodus’ in 2018, when the inflation rate in the Latin American country rose to over 130,000 per cent. Millions of people fled at that time from Venezuela to neighbouring countries, but also to Spain and the USA. In his Lab Talk, Weber showed how the number of active Facebook users who had previously lived in Venezuela changed in various regions of Colombia. Data on migration flows are naturally also available from official bodies, but they are frequently incomplete and less current than social media data. Conversely, the users of a specific platform are not representative of the total population. Weber’s models merge traditional and non-traditional data sources to address their respective weaknesses in a targeted manner, and he uses statistical tools to obtain the most plausible results. His calculations are deemed to be more reliable than the official statistics. The UN agency mandated to aid and protect refugees (UNHCR) used Weber’s results to target its humanitarian assistance to those who fled Venezuela more effectively; the UN Children’s Fund used them to develop regional Facebook campaigns combating xenophobia.
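The idea of merging an official figure with a platform-based figure can be illustrated with a generic textbook technique. The following is a minimal sketch, not Weber’s actual model, which the Lab Talk report does not describe. It assumes that both sources deliver an estimate of the same quantity together with a variance that expresses how uncertain it is, and it combines them by inverse-variance weighting. The function names and all numbers are hypothetical.

```python
def scale_platform_count(platform_users: float, penetration_rate: float) -> float:
    """Scale a platform audience count up to a population estimate.

    penetration_rate is the (assumed) share of the target population
    that uses the platform, a value in the interval (0, 1].
    """
    if not 0 < penetration_rate <= 1:
        raise ValueError("penetration_rate must be in (0, 1]")
    return platform_users / penetration_rate


def combine_estimates(official: float, official_var: float,
                      platform: float, platform_var: float) -> tuple[float, float]:
    """Inverse-variance weighted mean of two independent estimates.

    The source with the smaller variance (less uncertainty) gets the
    larger weight. Returns the combined estimate and its variance.
    """
    if official_var <= 0 or platform_var <= 0:
        raise ValueError("variances must be positive")
    w_official = 1.0 / official_var
    w_platform = 1.0 / platform_var
    total_weight = w_official + w_platform
    combined = (w_official * official + w_platform * platform) / total_weight
    return combined, 1.0 / total_weight


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    platform_estimate = scale_platform_count(50_000, penetration_rate=0.5)
    estimate, variance = combine_estimates(
        official=90_000.0, official_var=4.0e8,            # older, incomplete register
        platform=platform_estimate, platform_var=1.0e8,  # current, but biased sample
    )
    print(f"combined estimate: {estimate:.0f} (variance {variance:.3g})")
```

The design choice here is deliberately simple: neither source is trusted outright, and each contributes in proportion to how reliable it is assumed to be. Real models would also have to estimate the penetration rate and the variances themselves.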

Solutions to societal problems provided by computational social science

The researcher stressed in his Lab Talk that computational social science strives to collaborate with humanitarian and development-related organisations in generating solutions to societal problems. It is often these novel data sources and methods that enable the extent of a problem to be determined in the first place. For instance, some 50 per cent of countries hold no official data on the ‘digital gender gap’, in other words the gulf between genders when it comes to accessing and using digital technologies. Evaluating platform data enables meaningful estimates to be obtained: these reveal that twice as many men as women in Iraq have access to the internet, and three times as many in Niger. Such calculations create a basis for developing political measures to achieve greater equality.
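A statement such as ‘twice as many men as women’ is a simple ratio of two audience sizes. The following minimal sketch shows the arithmetic; the function names and the audience counts are hypothetical and are not taken from the Lab Talk.

```python
def gender_gap_ratio(male_users: int, female_users: int) -> float:
    """Number of male users per female user (1.0 means parity)."""
    if male_users < 0 or female_users <= 0:
        raise ValueError("counts must be non-negative, female_users positive")
    return male_users / female_users


def female_share(male_users: int, female_users: int) -> float:
    """Share of female users among all users, between 0 and 1."""
    total = male_users + female_users
    if male_users < 0 or female_users < 0 or total == 0:
        raise ValueError("counts must be non-negative and not both zero")
    return female_users / total


if __name__ == "__main__":
    # Hypothetical audience estimates for two regions, for illustration only.
    audiences = {
        "Region A": (2_000_000, 1_000_000),
        "Region B": (3_000_000, 1_000_000),
    }
    for region, (men, women) in audiences.items():
        ratio = gender_gap_ratio(men, women)
        share = female_share(men, women)
        print(f"{region}: {ratio:.1f} men per woman, {share:.0%} female")
```

Such a ratio only describes the users of one platform. Turning it into a statement about internet access in a whole country requires the kind of statistical correction for unrepresentative samples that the article mentions.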

Risks and challenges relating to the use of data

Despite the many advantages of evaluating ‘big social data’, there are of course also disadvantages and risks, said Weber at the end of his Lab Talk. Not only could the data samples be seen as unrepresentative – users’ behavioural patterns also alter over time. The platforms can moreover be considered to be a black box: researchers have no insight into the criteria by which Facebook, Weibo or Snapchat categorise their users. Weber also referred to the risk that results can be misused – for example by governments to persecute minorities.

Augmenting traditional methods with societal computing

This Lab Talk generated substantial interest among the participating alumni. The discussion included the issue of whether societal computing could replace conventional sociological analyses. ‘Absolutely not!’, replied Ingmar Weber – and referred to the fact that censuses were conducted by the ancient Greeks. He feels that quantitative and qualitative social research complement one another and that numerous surveys on migration cannot replace conversations with migrants regarding their experiences: research requires a variety of methods.

Data security within social networks

If you’re not paying for the product, then you are the product: most social media users are generally aware that their data are the price of these apparently free services – even if very few people actually take the time to read through the terms of use before registering. In the case of Facebook and the like, these state that the user grants the platform a global, non-exclusive and free-of-charge licence to use their content. The platforms use the extensive information that they collect about behaviour and preferences to create detailed user profiles, which are used for personalised advertising, among other things. Meta, the parent company of Facebook and Instagram, will from the end of June 2024 use posted photos and videos to train its artificial intelligence. In addition to the content that users actively provide, metadata such as the type of device and the place and time at which a photo was taken are also included in the user profiles. Many providers record the full movement profile of individuals who remain logged in over a long period.

The platforms introduced new functions for controlling and erasing data after several providers were sanctioned with fines running into millions for breaches of privacy. It is possible, for example, to object to Meta’s use of your personal data for its AI training. Experts recommend reading privacy policies and terms of use carefully and always choosing the most stringent privacy settings. It should also go without saying that we use only secure passwords and limit the number of people to whom contacts and posted content are visible.

Personal data relating to users of the Alumniportal Deutschland (APD) are very well protected. The protection of personal data is of the utmost priority for the organisations involved in its implementation (DAAD, Goethe-Institut and Alexander von Humboldt Foundation). Information on the processing of your personal data and your rights in the context of using the APD is available via the following link: .
