Polish Scientists Create a Special Algorithm to Detect Fake News

Dr. Piotr Przybyła is developing a very special algorithm. Based on the stylistic features of a text, this algorithm could detect whether news is fake or manipulated. His team wants to detect not only fake news but also bots on social media.

Algorithms that detect manipulated or harmful content are not new; they are already being used by social media platforms such as Facebook and Twitter. However, large corporations do not share information on how their algorithms work.

The team has built an algorithm that is largely innovative, because until now scientists have focused on analysing the veracity of the facts given in the content. Dr. Przybyła says that it is worthwhile looking at the style of texts made available online, in the form of news articles and posts on social media.

“We want to see what the efficiency of document credibility assessment is based on purely stylistic features,” he added.

He emphasises that their goal is to create an algorithm that not only detects fake news (which is the most glaring example of manipulated content), but also other propaganda techniques and bots.

How was the algorithm created?

First, the team collected a large database of English-language texts (about 100,000), which come from fact-checking organisations. At the same time, the algorithm received information on what features to use to distinguish between reliable and unreliable texts.

“Our machine learning model learns by itself - we give it input data with specific labelling and the features that describe that data. Then it is up to the algorithm to make a decision about linking features with reliability,” he describes.
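The idea of a model that links labelled features to reliability on its own can be illustrated with a minimal sketch. The article does not say which learning method the team uses, so the logistic regression below, the two features, and all numbers are assumptions chosen purely for illustration.

```python
import math

def train_logistic(features, labels, epochs=500, lr=0.5):
    """Fit weights and a bias with stochastic gradient descent on log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of "unreliable"
            error = p - y                    # gradient of log loss with respect to z
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias

def predict(weights, bias, x):
    """Return 1 (unreliable) if the linear score is positive, else 0 (reliable)."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1 if z > 0 else 0

# Made-up training data: [share of emotional words, share of sentences citing a source]
X = [[0.30, 0.00], [0.25, 0.05], [0.02, 0.40], [0.05, 0.35]]
y = [1, 1, 0, 0]  # 1 = unreliable, 0 = reliable (hypothetical labels)

weights, bias = train_logistic(X, y)
```

The researchers only supply the labels and the feature values; the sign and size of each learned weight is what "linking features with reliability" amounts to in this toy setting.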

Przybyła points out that many unreliable texts in English-language media concern political polarisation in the US. Many of them feature the names of Presidents Donald Trump and Barack Obama. Therefore, in order for the algorithm to work more effectively and not be “biased” by such words, Przybyła removed them from the texts provided to the algorithm. He hopes that in this way the data submitted for further analysis will be more objective; the algorithm will receive information that, for example, a sentence consists of an adjective, noun, adverb and verb, and thus will be blind to the information that researchers want to filter out because it disrupts the algorithm’s work.
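The two steps described above, removing specific names and reducing a sentence to its grammatical categories, might look roughly like the following sketch. The name list, the tiny hand-written lexicon, and the function names are all hypothetical; a real system would presumably rely on a trained part-of-speech tagger rather than a lookup table.

```python
import re

# Hypothetical list of names to strip; the article mentions only these two presidents.
MASKED_NAMES = ["Donald Trump", "Barack Obama", "Trump", "Obama"]

# Toy lexicon standing in for a real part-of-speech tagger (an assumption).
TOY_POS = {"angry": "ADJ", "senator": "NOUN", "loudly": "ADV", "protested": "VERB"}

def remove_names(text, names=MASKED_NAMES):
    """Delete every listed name and collapse the leftover whitespace."""
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    return re.sub(r"\s+", " ", pattern.sub("", text)).strip()

def to_pos_sequence(text, lexicon=TOY_POS):
    """Map each word to its category; words missing from the lexicon become 'X'."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    return [lexicon.get(token, "X") for token in tokens]

cleaned = remove_names("Barack Obama met Donald Trump today")
tags = to_pos_sequence("Angry senator loudly protested")
```

After these steps the model sees a sequence of categories rather than the words themselves, which is one way to keep it from latching onto topic-specific vocabulary.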

The researchers imposed categories of words on the algorithm to make it easier to control. Three main stylistic categories of unreliable information were observed:

Words that describe judgement and concern values and moral goals.
Words that describe power, respect, and influence.
Words that are strongly influenced by emotions, both positive and negative.

In turn, reliable texts cite other sources and present numerous data.
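As a rough illustration of how word categories like these could be turned into numeric features, the sketch below computes the share of a text's words that fall into each category. The three mini-lexicons are invented for the example and are not the researchers' actual word lists.

```python
import re

# Hypothetical mini-lexicons; the real categories and word lists are not given in the article.
CATEGORIES = {
    "judgement": {"should", "wrong", "moral", "duty"},
    "power": {"power", "respect", "influence", "control"},
    "emotion": {"outrage", "love", "hate", "terrifying"},
}

def category_rates(text, categories=CATEGORIES):
    """Return, per category, the fraction of tokens that belong to its lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {
        name: sum(token in words for token in tokens) / total
        for name, words in categories.items()
    }

rates = category_rates("They hate losing power and control")
```

Each rate is one feature; a text can then be described by a vector of such values, alongside others such as how often it cites sources or presents figures.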

“Of course, this is a great simplification, because in total we distinguished over 900 features that guide our algorithm,” he adds.

Dr. Przybyła focused on testing the method for the English language, because all researchers in this field know it. “It is also easier to access a large amount of well-prepared and proven data, which improves our work,” he notes. Only then, when the model assumptions turn out to be correct, will it be possible to create an analogous algorithm for other languages, including Polish.

The algorithm is already able to perform at an efficiency of 80-90%. However, this level of efficiency is not satisfactory for the researcher, so he and his team are still working to improve it.

According to Dr. Przybyła, it is not worth combining this algorithm with others to create a “super-algorithm.” “The user should know on what basis the machine makes decisions. The process should be transparent,” he says.

Przybyła is against automating the operation of the algorithm (for example, blocking the user from accessing content the algorithm considers false). He believes this decision should be made by users themselves.

The project is financed by the National Agency for Academic Exchange under the programme “Polish Returns.” The aim of the Polish Returns Programme is to allow prominent Polish scientists to return to their home country and find employment in Polish higher education institutions, scientific units or research institutes.

Source: https://www.gov.pl/web/nauka/algorytm-wykryje-fake-newsy-na-podstawie-stylu

By NAWA
