The best solution to this problem is to identify the biases in the training set and remove them in advance, but that is also the most expensive, time intensive and close to impossible, given the size of the underlying data.

Data Poisoning: We have been using deep learning for a long time, but traditionally the training sets were private and curated. Unfortunately, because these GenAI systems are trained on public data, they are exposed to data poisoning, which means creating false data and posting it publicly to negatively affect the results. For example, research has shown that by gaining access to a few old websites and injecting false data into them, it is possible to confuse the systems into producing wrong outcomes.

Copyright: The idea of GenAI is that its output has been created, not copied from somewhere else. Unfortunately, in some cases it appears the systems are borrowing copyrighted material. For example, Getty Images is suing Stability AI for copyright infringement after finding its watermark in outputs from the system. It is still unclear what the copyright liability could be for a company that publicly uses the output of one of these systems. For text, some GenAI systems offer the possibility to check the output for plagiarism and to estimate what percentage of the text is detected as machine-produced, although these checks are not perfect. Something similar could perhaps be used to screen generated images for copyrighted material.

Ethics: Large language models have no moral or ethical boundaries; they were built simply to predict which words should follow in a conversation, without any true understanding of the consequences and with very limited memory. As a consequence, they may suggest doing something alarming. For example, one system suggested a person should commit suicide after a six-week-long conversation about climate change concerns. In case it is not clear yet, conversational chatbots should not be used as a therapist or as a friend, as this can lead to horrible consequences.

Long term societal impact: There are potential long term societal impacts; for example, GenAI may affect the labor market by displacing workers. Individuals can train themselves to use GenAI, as it is fairly user friendly in most cases, and companies can proactively offer training to upskill their employees. Having said that, addressing the largest impacts will probably fall within the responsibility of public policy. There is also a long term risk of losing incentives to develop certain skills. Just as the widespread introduction of calculators made learning mental math less important and thus less common, GenAI systems may reduce the incentive for people to learn how to write long-form content properly.

In sum, GenAI has huge potential to drive productivity improvements, but also sizable issues that need to be addressed via mitigation strategies in order to gain the benefits without exposing ourselves and our companies to risks.