NGI ENGINEROOM

Explorations in Next Generation Internet Emerging
EU Engineroom focuses on identifying and evaluating the key enabling technologies and topics that will underpin the Next Generation Internet in 2025.
Engineroom’s three key pillars:

Consortium's partners:

Sources


Analysis period: two years and three months

Multiple gigabytes of raw and processed data

Why ArXiv and SSRN?

  • Lengthy publication process in scientific journals
  • Broad coverage:
      • SSRN's e-Library provides almost 800,000 research papers from 365,000 researchers across 30 disciplines
      • ArXiv provides open access to almost 1,500,000 e-prints, mostly in STEM fields

    Methodology

    Project Goal and General Idea

    • The main aim is to identify the key technologies that will shape the development of the Internet until 2025
    • Strong focus on the relationship between technological areas and social issues
    • Data-driven approach with heterogeneous sources of data

    Trend analysis

    • The analysis is based on the frequency of all unigrams and bigrams in the texts
    • The average monthly change in each analysed term's frequency is estimated with OLS regressions
    • The regression coefficient (slope) identifies the trending unigrams and bigrams
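
The trend estimate described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes one frequency value per term per month and regresses it on the month index, so the slope is the average monthly change. The function name `monthly_trend` and the sample series are hypothetical.

```python
from typing import Sequence


def monthly_trend(frequencies: Sequence[float]) -> float:
    """Return the OLS slope of term frequency regressed on the month index."""
    n = len(frequencies)
    if n < 2:
        raise ValueError("need at least two monthly observations")
    months = range(n)  # 0, 1, 2, ... one index per month
    mean_x = sum(months) / n
    mean_y = sum(frequencies) / n
    # Slope = cov(x, y) / var(x): the closed-form OLS estimate for one regressor.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, frequencies))
    var_x = sum((x - mean_x) ** 2 for x in months)
    return cov_xy / var_x


# Hypothetical monthly frequencies for two terms (illustrative values only).
series = {
    "net neutrality": [1.0, 2.0, 3.0, 4.0],
    "smart cities": [3.0, 3.0, 3.0, 3.0],
}
slopes = {term: monthly_trend(values) for term, values in series.items()}
# Terms with the largest positive slope are the strongest upward trends.
ranked = sorted(slopes, key=slopes.get, reverse=True)
```

Sorting by slope puts the steadily rising term first, while a flat series gets a slope of zero.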


    Co-occurrence analysis

    • Exploring the relationship between topics
    • Pairs of terms which are mentioned together in media articles
    • For every media website, the number of articles containing both terms is divided by the number of articles containing the previously identified keyword of interest
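
The ratio described above can be sketched as follows. This assumes simple case-insensitive substring matching, which the source does not specify. The function name `cooccurrence_share` and the sample articles are hypothetical.

```python
from typing import Iterable


def cooccurrence_share(articles: Iterable[str], keyword: str, term: str) -> float:
    """Share of the articles mentioning `keyword` that also mention `term`."""
    keyword, term = keyword.lower(), term.lower()
    # Assumption: a term "appears" if it occurs as a substring of the text.
    with_keyword = [a.lower() for a in articles if keyword in a.lower()]
    if not with_keyword:
        return 0.0
    with_both = sum(1 for a in with_keyword if term in a)
    return with_both / len(with_keyword)


# Hypothetical articles grouped by media website (illustrative only).
articles_by_site = {
    "site_a": [
        "Facial recognition raises new privacy concerns",
        "Privacy rules tighten across Europe",
        "New smartphone released",
    ],
    "site_b": ["Privacy and facial recognition in smart cities"],
}
# The ratio is computed separately for every website.
shares = {
    site: cooccurrence_share(texts, "privacy", "facial recognition")
    for site, texts in articles_by_site.items()
}
```

Computing the share per website keeps a source with many articles from dominating the result.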

    Issue mapping

    • Articles are categorised across two dimensions: geography (EU vs US) and covered topic (social vs technological)
    • Words are ranked based on their frequency in articles classified as social and non-social (technological)
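
A minimal sketch of this classification and ranking follows. It assumes each article already carries a region label and treats an article as social if it contains any word from a social-topic list. The list `SOCIAL_WORDS`, the function names and the sample articles are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical stand-in for the pre-defined list of social-topic words.
SOCIAL_WORDS = {"privacy", "discrimination", "rights"}


def classify(article: Dict[str, str]) -> Tuple[str, str]:
    """Return (region, topic) for an article with 'region' and 'text' keys."""
    words = set(article["text"].lower().split())
    topic = "social" if words & SOCIAL_WORDS else "technological"
    return article["region"], topic


def rank_words(articles: List[Dict[str, str]]) -> Dict[Tuple[str, str], List[Tuple[str, int]]]:
    """Count words within each (region, topic) class, most frequent first."""
    counts: Dict[Tuple[str, str], Counter] = {}
    for article in articles:
        key = classify(article)
        counts.setdefault(key, Counter()).update(article["text"].lower().split())
    return {key: counter.most_common() for key, counter in counts.items()}


# Hypothetical articles; the 'region' label is assumed to be known in advance.
articles = [
    {"region": "EU", "text": "privacy rights and blockchain"},
    {"region": "EU", "text": "privacy law"},
    {"region": "US", "text": "blockchain mining blockchain"},
]
ranking = rank_words(articles)
```

Each (region, topic) pair corresponds to one quadrant of the map, and the per-quadrant counts determine where a word is placed.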

    Wikipedia network analysis

    • Matching the keywords to Wikipedia articles and parsing their text to extract hyperlinks
    • Generating the network of hyperlinks that connects the articles with one another
    • Using a community detection algorithm (the Louvain method) to identify clusters of nodes
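
The first two steps above (extracting hyperlinks and building the network) can be sketched as follows. This assumes the article text is available as wikitext, where internal links are written as `[[Target]]` or `[[Target|label]]`. The function name `build_link_graph` and the sample pages are hypothetical. Community detection with the Louvain method would then run on the resulting graph, typically through a graph library, and is not shown here.

```python
import re
from typing import Dict, Set

# Matches wikitext internal links such as [[Blockchain]] or [[Blockchain|ledger]],
# capturing the target title up to any '|' (label) or '#' (section anchor).
LINK_PATTERN = re.compile(r"\[\[([^\]|#]+)")


def build_link_graph(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each page title to the set of other pages in `pages` it links to."""
    graph: Dict[str, Set[str]] = {title: set() for title in pages}
    for title, wikitext in pages.items():
        for target in LINK_PATTERN.findall(wikitext):
            target = target.strip()
            # Keep only edges between the matched articles and skip self-links.
            # Simplification: titles are compared exactly, including case.
            if target in pages and target != title:
                graph[title].add(target)
    return graph


# Hypothetical wikitext snippets (illustrative only).
pages = {
    "Blockchain": "A [[Smart contract|contract]] runs on a [[Blockchain]] "
                  "and may use a [[Cryptocurrency]].",
    "Smart contract": "See [[Blockchain#History]].",
    "Cryptocurrency": "No links here.",
}
graph = build_link_graph(pages)
```

Restricting edges to the matched articles keeps the network focused on the keywords of interest rather than on all of Wikipedia.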

    Main Programming Tools

    Topic identification

    The 132 top-trending NGI-related keywords are identified

    Grouped into 21 wider areas
    The size of the bubble is based on the regression coefficient
    Bigger bubble: more robust trend

    Topic co-occurrence

    The goal is to dive deeper into emerging technologies
    Relationship between social issues and technology
    These pairs frequently appear together in articles (news) or are used in comments about a topic (Reddit)

    News co-occurrence

    Reddit co-occurrence

    Short topic list


    Linked keywords: open internet, net neutrality, personal data, cambridge analytica, identity theft, black box, ai research

    Linked keywords: hate speech, alt-right, extremist content, sexism, gender discrimination, #metoo, child safety, trafficking, parental control, youtube kids, diversity, racism, accessibility, 5G networks, care robots, voice assistants and chatbots, online safety

    Linked keywords: smart contracts, distributed ledgers, facial recognition, digital assistant, voice assistant

    Linked keywords: cybersecurity, ransomware, cyberwar, cyber threats, meltdown, notpetya, hacking, quantum computing, encryption, critical infrastructure, autonomous weapons, killer robots, equifax

    Linked keywords: Machine learning, deep learning, algorithmic bias, algorithmic accountability, artificial intelligence, black box, open AI, data lakes, transparency

    Linked keywords: election hacking, election meddling, fake news, foreign intelligence, new media, filter bubble, echo chamber, media literacy, weaponisation of information, advertising, cambridge analytica, bots, fake accounts, platform economics, media platform, conspiracy theories

    Linked keywords: privacy, informed consent, smart cities, self-driving cars, facial recognition, surveillance, data brokers

    Linked keywords: open internet, net neutrality, free speech, internet freedom, gig economy, ico, workers' rights, tech giants, distributed ledgers, consumer protection

    Linked keywords: blockchain, cryptocurrency, smart devices, energy efficiency, mining, renewable energy, data storage

    Issue mapping

    Articles are classified in two dimensions: EU/US, social issue/technology

    EU axis: articles from European sources or concerning Europe
    Social issues axis: articles containing words from a pre-defined list of social topics
    Trending words are mapped to article types based on their number of occurrences
    Top right corner: EU articles on social issues
    Bottom left corner: US articles on technology

    Charts

    Application to explore trending keywords by source
    Common terms: compare the trend of the keyword across sources

    Trend robustness

    Case study

    Topic clusters around online privacy


    Online privacy is widely discussed within academia.
    To identify the main research topics, we ran a quick topic-modelling exercise.

    First, we web-scraped working papers on online privacy from SSRN, which covers the social sciences perspective.

    We then clustered the documents in this dataset using tf-idf, multidimensional scaling, k-means and pyLDAvis.
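
As an illustration of the first step in that pipeline, the sketch below computes tf-idf weights in plain Python. It uses the textbook form tf × log(N / df), while library implementations often add smoothing and normalisation, so the exact weights in the project may differ. The function name `tfidf` and the sample documents are hypothetical, and the later steps (multidimensional scaling, k-means, pyLDAvis) are not shown.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(documents: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per document using tf * log(N / df)."""
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical paper abstracts reduced to a few words (illustrative only).
documents = [
    "online privacy law",
    "online data privacy",
    "online data breach",
]
vectors = tfidf(documents)
```

A term that appears in every document (here "online") receives a weight of zero, while a term unique to one document (here "law") is weighted most heavily. That is what lets a clustering step separate the documents by topic.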


    The preliminary results:

    SSRN topic clusters around online privacy

    Wikipedia

    Network around NGI related technologies

    Network around NGI related social issues and values keywords

    About

    EU ENGINEROOM has received funding from the European Union's Horizon 2020 research and innovation programme under the Grant Agreement no 780643. The content of this website does not represent the opinion of the European Union, and the European Union is not responsible for any use that might be made of such content.