





Subcategories included in the analysis:
cs.AI Artificial Intelligence
cs.CL Computation and Language
cs.CC Computational Complexity
cs.CE Computational Engineering, Finance, and Science
cs.CG Computational Geometry
cs.GT Computer Science and Game Theory
cs.CV Computer Vision and Pattern Recognition
cs.CY Computers and Society
cs.CR Cryptography and Security
cs.DS Data Structures and Algorithms
cs.DB Databases
cs.DL Digital Libraries
cs.DM Discrete Mathematics
cs.DC Distributed, Parallel, and Cluster Computing
cs.ET Emerging Technologies
cs.FL Formal Languages and Automata Theory
cs.GL General Literature
cs.GR Graphics
cs.AR Hardware Architecture
cs.HC Human-Computer Interaction
cs.IR Information Retrieval
cs.IT Information Theory
cs.LG Learning
cs.LO Logic in Computer Science
cs.MS Mathematical Software
cs.MA Multiagent Systems
cs.MM Multimedia
cs.NI Networking and Internet Architecture
cs.NE Neural and Evolutionary Computing
cs.NA Numerical Analysis
cs.OS Operating Systems
cs.OH Other Computer Science
cs.PF Performance
cs.PL Programming Languages
cs.RO Robotics
cs.SI Social and Information Networks
cs.SE Software Engineering
cs.SD Sound
cs.SC Symbolic Computation
cs.SY Systems and Control
Trend analysis
Topic filtering
Following the steps discussed in the methodologies section, the lists of unigrams and bigrams are sorted by regression coefficient, and NGI-related terms are then manually selected. The process is as follows:
News media
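The ranking step described above can be illustrated with a minimal sketch. It assumes that each term has a series of relative frequencies per time period and that the regression coefficient is the ordinary least-squares slope of that series against the period index; the source does not spell out the exact model, and the term names and numbers below are made up for illustration.

```python
from statistics import mean


def trend_slope(counts):
    """Ordinary least-squares slope of counts against period index 0..n-1."""
    xs = range(len(counts))
    x_bar, y_bar = mean(xs), mean(counts)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, counts))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def rank_terms(frequencies):
    """Return (term, slope) pairs sorted by slope, steepest increase first."""
    slopes = {term: trend_slope(series) for term, series in frequencies.items()}
    return sorted(slopes.items(), key=lambda item: item[1], reverse=True)


# Hypothetical relative frequencies per period (invented numbers, not real data).
example_frequencies = {
    "falling term": [4.0, 3.0, 2.0, 1.0],
    "rising term": [1.0, 2.0, 3.0, 4.0],
    "slow term": [2.0, 2.0, 3.0, 3.0],
}

ranked = rank_terms(example_frequencies)
for term, slope in ranked:
    print(f"{term}: {slope:+.2f}")
```

The sorted list is only a starting point: the manual selection of NGI-related terms mentioned above would still happen on top of this ranking.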
Filtering process
Linked keywords: open internet, net neutrality, personal data, cambridge analytica, identity theft, black box, ai research
Linked keywords: hate speech, alt-right, extremist content, sexism, gender discrimination, #metoo, child safety, trafficking, parental control, youtube kids, diversity, racism, accessibility, 5G networks, care robots, voice assistants and chatbots, online safety
Linked keywords: smart contracts, distributed ledgers, facial recognition, digital assistant, voice assistant
Linked keywords: cybersecurity, ransomware, cyberwar, cyber threats, meltdown, notpetya, hacking, quantum computing, encryption, critical infrastructure, autonomous weapons, killer robots, equifax
Linked keywords: Machine learning, deep learning, algorithmic bias, algorithmic accountability, artificial intelligence, black box, open AI, data lakes, transparency
Linked keywords: election hacking, election meddling, fake news, foreign intelligence, new media, filter bubble, echo chamber, media literacy, weaponisation of information, advertising, cambridge analytica, bots, fake accounts, platform economics, media platform, conspiracy theories
Linked keywords: privacy, informed consent, smart cities, self-driving cars, facial recognition, surveillance, data brokers |
Linked keywords: open internet, net neutrality, free speech, internet freedom, gig economy, ico, worker's rights, tech giants, distributed ledgers, consumer protection |
Linked keywords: blockchain, cryptocurrency, smart devices, energy efficiency, mining, renewable energy, data storage |
The Cambridge Analytica revelations have made it abundantly clear how little control we have as consumers over our own personal data, and over the way the internet operates more generally. Building solutions and new models that allow citizens to understand the increasingly complex processes behind dominant internet applications should therefore be a key component of a more democratic NGI. Experimentation with encrypted data boxes and data commons, as well as fostering more citizen involvement in, for example, internet governance processes and technology development, would need to be part of this, as would moving production of these technologies back into Europe.
Before we can talk about building a next generation internet, we need to ensure that all Europeans have access to the current generation of the internet. That means investing in infrastructure and in multilingual and accessible tools (targeting, for example, less tech-savvy users or users with disabilities), but also creating a safe environment for all, since online harassment and hostility particularly affect more vulnerable groups. Increasing diversity in who gets to build and use the internet is important if we want to avoid perpetuating existing inequalities in the digital economy; it also helps stimulate innovation, as diverse teams tend to be more creative.
If Europe were able to build a trustworthy and secure system for managing online identities, offering an e-ID to every European citizen (not unlike the incredibly successful Estonian model), the continent would be able to take a massive leap ahead in strengthening e-commerce and other relationship-based online interactions. Effective identity management wouldn’t only increase trust on the internet (who am I really talking to? Can I trust the online service?) and so bolster the European digital economy, but would also help us build more personal online relationships. The currently dominant rate-and-review system places a lot of power in the hands of the reviewer (a single low score on a ride-sharing app can seriously damage a driver’s ability to attract new customers); e-identity systems could make these interactions more positive and equal.
One key component of building a more democratic and inclusive Next Generation Internet is ensuring the infrastructures underpinning the internet itself are secure, safe and resilient. We live in a time of growing cyber threats: from rising cyber crime to ever more sophisticated cyber warfare capabilities. Existing weaknesses and flaws in the internet’s physical infrastructure and protocols also require urgent mending. Governments, the private sector and citizens need access to the right tools and information to help them protect themselves against these kinds of threats, and larger system changes are required to ensure our (critical) infrastructures are resilient in the face of emerging challenges such as quantum computing-enabled cracking of encryption.
As discussions about the potential transformative impact of AI and Machine Learning have come to dominate public debate in recent years, so have concerns about the potential negative side-effects of allowing these kinds of technologies to play an ever-larger role in decision-making and the governing of our societies. The development of ethical AI and ML tools doesn’t only involve the use of responsibly managed data (making sure we have a representative sample and that privacy and anonymity are ensured) and algorithms that don’t further existing societal biases (around gender and ethnicity, for example), but also requires that the tools themselves are used for purposes we consider ethically just. This means ensuring we have solutions that are fair and inclusive along the whole value chain (from data generation to the impact of the decisions being made or tasks replaced).
The proliferation of “fake news” and the weaponisation of information is a key challenge for the internet today, threatening the foundations of our democracies and even societies. Ensuring access to trustworthy information, and preventing the deliberate manipulation of information flows without resorting to censorship or hampering freedom of speech, remains an unsolved challenge, however. Under this topic, we would explore potential solutions for specific issues such as fake news bots and preventing filter bubbles, but also take a wider view in trying to strengthen (social) media ecosystems and exploring alternative sustainable business models for quality news and information provision.
With the internet becoming ever more pervasive in our lives and societies, shaping our jobs, our cities, our interactions with the government and so forth, it has become harder and harder for individuals to shape their relationship with, or opt out of, “the internet” altogether. With the rise of the smart city, and the millions of connected IoT devices that will underpin it tracking our every move, how do we ensure citizens can meaningfully consent to what happens with the data they generate, and retain their privacy? With everything from our smart vacuums to credit card companies collecting and selling our data to the highest bidder (through very opaque processes), we need new solutions that help citizens give informed consent, as well as the ability to completely opt out of being part of, for example, data sharing systems, while still being able to use key services.
Most of the issues the internet faces today are a direct consequence of the increased monopolization of the internet and the business models that sustain this dynamic; those that are not are, at the very least, more complicated to address because so few actors have the power to do so. We urgently require new business models that can provide an alternative to the reigning advertising-supported model, and can sustain a more pluralistic and healthy digital economy. Alternative models, such as platform cooperativism or blockchain-enabled micropayment systems, can help empower smaller players, help level the playing field and offer better protections to consumers and digital workers. We need to support initiatives and SMEs that operate under these alternative models, through policy (protecting net neutrality, designing next generation competition and antitrust policy) and funding support.
One key challenge for the internet moving forward is ensuring that the hardware and infrastructures underpinning it are sustainable and can meaningfully contribute to building a more circular economy. The challenges around the internet’s environmental footprint are myriad: from the extraordinary amount of energy used by data centres and emerging technologies like blockchain, to the costly mining processes behind the materials making our tech devices function. Though there is a growing recognition of these issues, we do not yet see enough solutions. This is, we think, a space where Europe can start to play an important front-runner role.
Issue mapping
The results of the mapping exercise suggest that European sources, and Europeans in general, are more concerned about the social aspects of new technologies (see the crowded diagonal line of the chart). Nevertheless, some meaningful exceptions (outliers) can be pointed out. For example, the cluster of terms related to net neutrality is discussed more intensively in the US media, while robotic operating systems and quantum computing technologies are more popular topics in the European sources.
Charts
Comparing the trends of terms across different sources provides interesting insights. Tech journalists and scientists focus on different things in their analyses, which is well reflected by such words as “ai” or “machine learning”. These more general words show a robust trend in popular news sources, with a much less significant increase in academic sources. Scientists, on the other hand, do research on narrower sub-fields of artificial intelligence, such as learning models. The charts also show the difference between the social science (SSRN) and computer science (arXiv) research fields. As an example, the term “discriminatory” has been on a stronger trend in abstracts of working papers published on SSRN than on arXiv.
Trend Robustness
Online privacy is a widely discussed issue within academia. In order to identify the main research topics, we carried out a quick topic modeling exercise. First, we scraped working papers related to online privacy, written from a social science perspective, from SSRN. On this dataset we then performed document clustering using tf-idf, multidimensional scaling and k-means, with pyLDAvis used for visualization.
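The tf-idf and k-means steps mentioned above can be sketched without external dependencies. This is an illustration under stated assumptions, not the pipeline actually used: the toy "abstracts", the smoothed idf variant and the choice of starting centres are all hypothetical, and the multidimensional scaling and pyLDAvis steps are left out. In practice a library such as scikit-learn would typically handle these steps.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised tf-idf vectors) for tokenised docs."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed idf, one common variant: ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def kmeans(vectors, init_centres, n_iter=10):
    """Plain Lloyd iterations from the given starting centres; returns labels."""
    centres = [list(c) for c in init_centres]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(len(centres)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, centres[k])))
            for vec in vectors
        ]
        # Update step: each centre becomes the mean of its members.
        for k in range(len(centres)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy "abstracts", already lower-cased and tokenised.
docs = [
    "consent data privacy regulation".split(),
    "privacy consent regulation law".split(),
    "surveillance camera facial recognition".split(),
    "facial recognition surveillance city".split(),
]
vocab, vectors = tfidf_vectors(docs)
# Seed the two clusters with the first and third document (an arbitrary choice).
labels = kmeans(vectors, [vectors[0], vectors[2]])
print(labels)
```

Fixing the starting centres keeps the toy run deterministic; a real analysis would normally use several random initialisations and compare the results.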
Online privacy case study
Topic models are of great help in automating the exploration of the structure of a large set of documents, clustering documents based on the words that occur in them. It is assumed that documents on similar subjects tend to use a similar vocabulary. Topic modeling is one of the most powerful techniques in text mining used for discovering the latent topics that occur in a collection of text documents and examining the relationships among them (Jelodar et al. 2017).
Visualization explanation: the bubbles represent topics identified by the LDA. The number of topics is arbitrarily set to 15. The positions of the bubbles are determined by computing the distance between topics, and then the inter-topic distances are projected onto two dimensions (see: Chuang et al., 2012). The topic’s prevalence is shown by the size of the bubble. In the right panel, bars represent the individual terms that help in interpreting the currently selected topic on the left. A pair of overlaid bars represents both the corpus-wide frequency of a given term and the topic-specific frequency of the term (see: Chuang et al., 2012). For example, the currently selected topic (active, shown in red) depicts the cluster of working papers related to online privacy.
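The two quantities behind the bubble chart, inter-topic distance and topic prevalence, can be sketched as follows. The source does not state which distance measure was used, so Jensen-Shannon divergence is an assumption here, and the topic-term and document-topic numbers are hypothetical model output. The projection onto two dimensions is omitted.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def topic_distances(topic_term):
    """Symmetric matrix of pairwise divergences between topics."""
    n = len(topic_term)
    return [[js_divergence(topic_term[i], topic_term[j]) for j in range(n)]
            for i in range(n)]


def topic_prevalence(doc_topic, doc_lengths):
    """Share of all tokens attributed to each topic (drives bubble size)."""
    total = sum(doc_lengths)
    n_topics = len(doc_topic[0])
    return [sum(w[k] * n for w, n in zip(doc_topic, doc_lengths)) / total
            for k in range(n_topics)]


# Hypothetical model output: 3 topics over a 4-term vocabulary.
topic_term = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.25, 0.25, 0.25, 0.25],
]
# Hypothetical per-document topic weights and document lengths in tokens.
doc_topic = [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2]]
doc_lengths = [100, 300]

distances = topic_distances(topic_term)
prevalence = topic_prevalence(doc_topic, doc_lengths)
print(distances[0][1], distances[0][2])
print(prevalence)
```

In this toy case the two topics concentrated on different terms end up farther apart than either is from the uniform topic, which is the kind of separation the bubble positions are meant to convey.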