{ "cells": [ { "metadata": {}, "cell_type": "markdown", "source": "# Setup", "id": "94ccd7635ea1a327" }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T19:55:56.986897Z", "start_time": "2025-05-24T19:55:56.391751Z" } }, "cell_type": "code", "source": [ "import os\n", "from dotenv import load_dotenv\n", "from langchain_huggingface import HuggingFaceEmbeddings\n", "from langchain_community.vectorstores import SupabaseVectorStore\n", "from supabase.client import Client, create_client\n", "\n", "load_dotenv()" ], "id": "d3a0e620344270b7", "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 1 }, { "metadata": {}, "cell_type": "markdown", "source": "### Question data", "id": "d0cc4adf" }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T19:56:02.747070Z", "start_time": "2025-05-24T19:56:02.741775Z" } }, "cell_type": "code", "source": [ "# Load metadata.jsonl\n", "import json\n", "# Load the metadata.jsonl file\n", "with open('metadata.jsonl', 'r') as jsonl_file:\n", " json_list = list(jsonl_file)\n", "\n", "json_QA = []\n", "for json_str in json_list:\n", " json_data = json.loads(json_str)\n", " json_QA.append(json_data)" ], "id": "14e3f417", "outputs": [], "execution_count": 2 }, { "cell_type": "code", "id": "5e2da6fc", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T19:56:03.067990Z", "start_time": "2025-05-24T19:56:03.064261Z" } }, "source": [ "# randomly select 3 samples\n", "# {\"task_id\": \"c61d22de-5f6c-4958-a7f6-5e9707bd3466\", \"Question\": \"A paper about AI regulation that was originally submitted to arXiv.org in June 2022 shows a figure with three axes, where each axis has a label word at both ends. Which of these words is used to describe a type of society in a Physics and Society article submitted to arXiv.org on August 11, 2016?\", \"Level\": 2, \"Final answer\": \"egalitarian\", \"file_name\": \"\", \"Annotator Metadata\": {\"Steps\": \"1. Go to arxiv.org and navigate to the Advanced Search page.\\n2. Enter \\\"AI regulation\\\" in the search box and select \\\"All fields\\\" from the dropdown.\\n3. Enter 2022-06-01 and 2022-07-01 into the date inputs, select \\\"Submission date (original)\\\", and submit the search.\\n4. Go through the search results to find the article that has a figure with three axes and labels on each end of the axes, titled \\\"Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation\\\".\\n5. Note the six words used as labels: deontological, egalitarian, localized, standardized, utilitarian, and consequential.\\n6. Go back to arxiv.org\\n7. Find \\\"Physics and Society\\\" and go to the page for the \\\"Physics and Society\\\" category.\\n8. Note that the tag for this category is \\\"physics.soc-ph\\\".\\n9. Go to the Advanced Search page.\\n10. Enter \\\"physics.soc-ph\\\" in the search box and select \\\"All fields\\\" from the dropdown.\\n11. Enter 2016-08-11 and 2016-08-12 into the date inputs, select \\\"Submission date (original)\\\", and submit the search.\\n12. Search for instances of the six words in the results to find the paper titled \\\"Phase transition from egalitarian to hierarchical societies driven by competition between cognitive and social constraints\\\", indicating that \\\"egalitarian\\\" is the correct answer.\", \"Number of steps\": \"12\", \"How long did this take?\": \"8 minutes\", \"Tools\": \"1. Web browser\\n2. 
Image recognition tools (to identify and parse a figure with three axes)\", \"Number of tools\": \"2\"}}\n", "\n", "import random\n", "# random.seed(42)\n", "random_samples = random.sample(json_QA, 1)\n", "for sample in random_samples:\n", " print(\"=\" * 50)\n", " print(f\"Task ID: {sample['task_id']}\")\n", " print(f\"Question: {sample['Question']}\")\n", " print(f\"Level: {sample['Level']}\")\n", " print(f\"Final Answer: {sample['Final answer']}\")\n", " print(f\"Annotator Metadata: \")\n", " print(f\" ├── Steps: \")\n", " for step in sample['Annotator Metadata']['Steps'].split('\\n'):\n", " print(f\" │ ├── {step}\")\n", " print(f\" ├── Number of steps: {sample['Annotator Metadata']['Number of steps']}\")\n", " print(f\" ├── How long did this take?: {sample['Annotator Metadata']['How long did this take?']}\")\n", " print(f\" ├── Tools:\")\n", " for tool in sample['Annotator Metadata']['Tools'].split('\\n'):\n", " print(f\" │ ├── {tool}\")\n", " print(f\" └── Number of tools: {sample['Annotator Metadata']['Number of tools']}\")\n", "print(\"=\" * 50)" ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "==================================================\n", "Task ID: ad2b4d70-9314-4fe6-bfbe-894a45f6055f\n", "Question: Eva Draconis has a personal website which can be accessed on her YouTube page. What is the meaning of the only symbol seen in the top banner that has a curved line that isn't a circle or a portion of a circle? Answer without punctuation.\n", "Level: 3\n", "Final Answer: War is not here this is a land of peace\n", "Annotator Metadata: \n", " ├── Steps: \n", " │ ├── 1. By googling Eva Draconis youtube, you can find her channel.\n", " │ ├── 2. In her about section, she has written her website URL, orionmindproject.com.\n", " │ ├── 3. Entering this website, you can see a series of symbols at the top, and the text \"> see what the symbols mean here\" below it.\n", " │ ├── 4. Reading through the entries, you can see a short description of some of the symbols.\n", " │ ├── 5. The only symbol with a curved line that isn't a circle or a portion of a circle is the last one.\n", " │ ├── 6. Note that the symbol supposedly means \"War is not here, this is a land of peace.\"\n", " ├── Number of steps: 6\n", " ├── How long did this take?: 30 minutes.\n", " ├── Tools:\n", " │ ├── 1. A web browser.\n", " │ ├── 2. A search engine.\n", " │ ├── 3. Access to YouTube\n", " │ ├── 4. 
Image recognition tools\n", " └── Number of tools: 4\n", "==================================================\n" ] } ], "execution_count": 3 }, { "cell_type": "code", "id": "4bb02420", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:20.514462Z", "start_time": "2025-05-24T20:01:18.267225Z" } }, "source": [ "### build a vector database based on the metadata.jsonl\n", "# https://python.langchain.com/docs/integrations/vectorstores/supabase/\n", "\n", "embeddings = HuggingFaceEmbeddings(model_name=\"sentence-transformers/all-mpnet-base-v2\") # dim=768\n", "\n", "supabase_url = os.environ.get(\"SUPABASE_URL\")\n", "supabase_key = os.environ.get(\"SUPABASE_SERVICE_KEY\")\n", "supabase: Client = create_client(supabase_url, supabase_key)" ], "outputs": [], "execution_count": 15 }, { "cell_type": "code", "id": "a070b955", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:26.037828Z", "start_time": "2025-05-24T20:01:21.793806Z" } }, "source": [ "# wrap the metadata.jsonl's questions and answers into a list of document\n", "from langchain.schema import Document\n", "docs = []\n", "for sample in json_QA:\n", " content = f\"Question : {sample['Question']}\\n\\nFinal answer : {sample['Final answer']}\"\n", " doc = {\n", " \"content\" : content,\n", " \"metadata\" : {\n", " \"source\" : sample['task_id']\n", " },\n", " \"embedding\" : embeddings.embed_query(content),\n", " }\n", " docs.append(doc)" ], "outputs": [], "execution_count": 16 }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:26.050013Z", "start_time": "2025-05-24T20:01:26.047073Z" } }, "cell_type": "code", "source": "docs[0].keys()", "id": "94ad3a742a0f9ad6", "outputs": [ { "data": { "text/plain": [ "dict_keys(['content', 'metadata', 'embedding'])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 17 }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:26.079523Z", "start_time": "2025-05-24T20:01:26.074105Z" } }, "cell_type": "code", "source": "docs[0]", "id": "366b7f9e2200fddd", "outputs": [ { "data": { "text/plain": [ "{'content': 'Question : A paper about AI regulation that was originally submitted to arXiv.org in June 2022 shows a figure with three axes, where each axis has a label word at both ends. 
Which of these words is used to describe a type of society in a Physics and Society article submitted to arXiv.org on August 11, 2016?\\n\\nFinal answer : egalitarian',\n", " 'metadata': {'source': 'c61d22de-5f6c-4958-a7f6-5e9707bd3466'},\n", " 'embedding': [-0.002634669654071331,\n", " 0.02306019514799118,\n", " -0.01757298782467842,\n", " -0.013291980139911175,\n", " -0.020390372723340988,\n", " -0.016542401164770126,\n", " 0.040015820413827896,\n", " 0.017529122531414032,\n", " 0.023523375391960144,\n", " -0.028402695432305336,\n", " 0.05848900228738785,\n", " 0.03848639503121376,\n", " -0.036033838987350464,\n", " 0.05896377190947533,\n", " -0.02319096401333809,\n", " -0.04314032942056656,\n", " 0.0205579474568367,\n", " 0.042271874845027924,\n", " -0.015511706471443176,\n", " 0.011203325353562832,\n", " -0.02394339069724083,\n", " 0.008462891913950443,\n", " 0.034201279282569885,\n", " 0.014322657138109207,\n", " 0.032600849866867065,\n", " 0.018118279054760933,\n", " 0.03146226331591606,\n", " -0.011441514827311039,\n", " 0.024735553190112114,\n", " -0.01013961248099804,\n", " 0.06563965231180191,\n", " 0.06661009788513184,\n", " 0.010979978367686272,\n", " 0.022725064307451248,\n", " 2.0829186269111233e-06,\n", " -0.031171612441539764,\n", " -0.01904902420938015,\n", " 0.015555020421743393,\n", " 0.047359585762023926,\n", " -0.0333370566368103,\n", " -0.005494444631040096,\n", " -0.02123098075389862,\n", " -0.03431028872728348,\n", " -0.002660144818946719,\n", " -0.014628139324486256,\n", " -0.013374630361795425,\n", " 0.0523875467479229,\n", " -0.0076402281410992146,\n", " -0.032536085695028305,\n", " -0.03356455639004707,\n", " 0.01338665746152401,\n", " -0.050998616963624954,\n", " -0.02948424220085144,\n", " -0.010373082011938095,\n", " -0.0038178032264113426,\n", " 0.017957346513867378,\n", " 0.01585584320127964,\n", " -0.006808921694755554,\n", " -0.008414482697844505,\n", " 0.07390625774860382,\n", " -0.036274828016757965,\n", " 0.02431362494826317,\n", " 0.008123979903757572,\n", " -0.0169353187084198,\n", " 0.10557392984628677,\n", " 0.0014651010278612375,\n", " 0.007015349809080362,\n", " -0.031939949840307236,\n", " 0.004718571901321411,\n", " 0.050509411841630936,\n", " 0.19570530951023102,\n", " 0.032000474631786346,\n", " 0.02874969132244587,\n", " 0.03275604173541069,\n", " -0.037697359919548035,\n", " 0.027806980535387993,\n", " 0.013025457970798016,\n", " -0.016645951196551323,\n", " -0.0022332428488880396,\n", " -0.05041703209280968,\n", " 0.031642232090234756,\n", " 0.07753070443868637,\n", " -0.030744211748242378,\n", " 0.039764177054166794,\n", " -0.07469252496957779,\n", " 0.06995853036642075,\n", " -0.015405813232064247,\n", " 0.04118141531944275,\n", " -0.07314331084489822,\n", " -0.04045465216040611,\n", " -0.01070796325802803,\n", " -0.011583046987652779,\n", " 0.04243302345275879,\n", " 0.003167223185300827,\n", " 0.03560613468289375,\n", " 0.0006528274388983846,\n", " 0.017188072204589844,\n", " -0.028718432411551476,\n", " 0.04235910624265671,\n", " 0.008073941804468632,\n", " -0.021448951214551926,\n", " 0.0021353664342314005,\n", " 0.06783191114664078,\n", " 0.04190478101372719,\n", " 0.08950210362672806,\n", " -0.015146220102906227,\n", " 0.025773515924811363,\n", " 0.025072839111089706,\n", " 0.0004630378680303693,\n", " 0.03552567958831787,\n", " 0.00731520913541317,\n", " 0.005610407330095768,\n", " -0.01574847847223282,\n", " 0.019460951909422874,\n", " 0.02298450842499733,\n", " -0.030836662277579308,\n", " -0.01115537341684103,\n", " 
... (remaining values of the 768-dim embedding omitted) ...,\n", " 
0.052254095673561096,\n", " 0.003934715874493122,\n", " 0.03689158335328102,\n", " 0.01333687361329794,\n", " -0.015352764166891575,\n", " -0.005101861897855997,\n", " -0.019899796694517136,\n", " -0.0077162571251392365,\n", " 0.02962283231317997,\n", " 0.04794803634285927,\n", " -0.030213722959160805,\n", " 2.8779818749836593e-34,\n", " 0.00043316834489814937,\n", " -0.038739610463380814,\n", " -0.05394865199923515,\n", " -0.019675668329000473,\n", " 0.01223861612379551,\n", " -0.001877678674645722,\n", " 0.04158131033182144,\n", " -0.028788113966584206,\n", " -0.00927739217877388,\n", " -0.01771310530602932,\n", " 0.014376083388924599]}" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 18 }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:26.186860Z", "start_time": "2025-05-24T20:01:26.184093Z" } }, "cell_type": "code", "source": "len(docs[1]['embedding'])", "id": "f38edb7db76bad31", "outputs": [ { "data": { "text/plain": [ "768" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 19 }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:27.750720Z", "start_time": "2025-05-24T20:01:26.234529Z" } }, "cell_type": "code", "source": [ "# upload the documents to the vector database\n", "try:\n", " response = (\n", " supabase.table(\"documents\")\n", " .insert(docs)\n", " .execute()\n", " )\n", "except Exception as exception:\n", " print(\"Error inserting data into Supabase:\", exception)\n", "\n", "# ALTERNATIVE : Save the documents (a list of dict) into a csv file, and manually upload it to Supabase\n", "# import pandas as pd\n", "# df = pd.DataFrame(docs)\n", "# df.to_csv('supabase_docs.csv', index=False)" ], "id": "e274a83872ca484f", "outputs": [], "execution_count": 20 }, { "cell_type": "code", "id": "77fb9dbb", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:29.314814Z", "start_time": "2025-05-24T20:01:29.312121Z" } }, "source": [ "# add items to vector database\n", "vector_store = SupabaseVectorStore(\n", " client=supabase,\n", " embedding= embeddings,\n", " table_name=\"documents\",\n", " query_name=\"match_documents_langchain\",\n", ")\n", "retriever = vector_store.as_retriever()" ], "outputs": [], "execution_count": 21 }, { "cell_type": "code", "id": "12a05971", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:30.530131Z", "start_time": "2025-05-24T20:01:30.322872Z" } }, "source": [ "query = \"On June 6, 2023, an article by Carolyn Collins Petersen was published in Universe Today. This article mentions a team that produced a paper about their observations, linked at the bottom of the article. Find this paper. Under what NASA award number was the work performed by R. G. 
Arendt supported by?\"\n", "matched_docs = vector_store.similarity_search(query, 2)" ], "outputs": [], "execution_count": 22 }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:33.484310Z", "start_time": "2025-05-24T20:01:33.266356Z" } }, "cell_type": "code", "source": "docs = retriever.invoke(query)", "id": "deb58a8cee3954a2", "outputs": [], "execution_count": 23 }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:01:34.633646Z", "start_time": "2025-05-24T20:01:34.631042Z" } }, "cell_type": "code", "source": "docs[0]", "id": "e6f3da34e4ed8543", "outputs": [ { "data": { "text/plain": [ "Document(metadata={'source': '840bfca7-4f7b-481a-8794-c560c340185d'}, page_content='Question : On June 6, 2023, an article by Carolyn Collins Petersen was published in Universe Today. This article mentions a team that produced a paper about their observations, linked at the bottom of the article. Find this paper. Under what NASA award number was the work performed by R. G. Arendt supported by?\\n\\nFinal answer : 80GSFC21M0002')" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 24 }, { "cell_type": "code", "id": "1eae5ba4", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:04:10.311683Z", "start_time": "2025-05-24T20:04:10.307171Z" } }, "source": [ "# list of the tools used in all the samples\n", "from collections import Counter, OrderedDict\n", "\n", "tools = []\n", "for sample in json_QA:\n", " for tool in sample['Annotator Metadata']['Tools'].split('\\n'):\n", " tool = tool[2:].strip().lower()\n", " if tool.startswith(\"(\"):\n", " tool = tool[11:].strip()\n", " tools.append(tool)\n", "tools_counter = OrderedDict(Counter(tools))\n", "print(\"List of tools used in all samples:\")\n", "print(\"Total number of tools used:\", len(tools_counter))\n", "for tool, count in tools_counter.items():\n", " print(f\" ├── {tool}: {count}\")" ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "List of tools used in all samples:\n", "Total number of tools used: 83\n", " ├── web browser: 107\n", " ├── image recognition tools (to identify and parse a figure with three axes): 1\n", " ├── search engine: 101\n", " ├── calculator: 34\n", " ├── unlambda compiler (optional): 1\n", " ├── a web browser.: 2\n", " ├── a search engine.: 2\n", " ├── a calculator.: 1\n", " ├── microsoft excel: 5\n", " ├── google search: 1\n", " ├── ne: 9\n", " ├── pdf access: 7\n", " ├── file handling: 2\n", " ├── python: 3\n", " ├── image recognition tools: 12\n", " ├── jsonld file access: 1\n", " ├── video parsing: 1\n", " ├── python compiler: 1\n", " ├── video recognition tools: 3\n", " ├── pdf viewer: 7\n", " ├── microsoft excel / google sheets: 3\n", " ├── word document access: 1\n", " ├── tool to extract text from images: 1\n", " ├── a word reversal tool / script: 1\n", " ├── counter: 1\n", " ├── excel: 3\n", " ├── image recognition: 5\n", " ├── color recognition: 3\n", " ├── excel file access: 3\n", " ├── xml file access: 1\n", " ├── access to the internet archive, web.archive.org: 1\n", " ├── text processing/diff tool: 1\n", " ├── gif parsing tools: 1\n", " ├── a web browser: 7\n", " ├── a search engine: 7\n", " ├── a speech-to-text tool: 2\n", " ├── code/data analysis tools: 1\n", " ├── audio capability: 2\n", " ├── pdf reader: 1\n", " ├── markdown: 1\n", " ├── a calculator: 5\n", " ├── access to wikipedia: 3\n", " ├── image recognition/ocr: 3\n", " ├── google translate access: 1\n", " ├── ocr: 4\n", " ├── bass note data: 1\n", " ├── text 
editor: 1\n", " ├── xlsx file access: 1\n", " ├── powerpoint viewer: 1\n", " ├── csv file access: 1\n", " ├── calculator (or use excel): 1\n", " ├── computer algebra system: 1\n", " ├── video processing software: 1\n", " ├── audio processing software: 1\n", " ├── computer vision: 1\n", " ├── google maps: 1\n", " ├── access to excel files: 1\n", " ├── calculator (or ability to count): 1\n", " ├── a file interface: 3\n", " ├── a python ide: 1\n", " ├── spreadsheet editor: 1\n", " ├── tools required: 1\n", " ├── b browser: 1\n", " ├── image recognition and processing tools: 1\n", " ├── computer vision or ocr: 1\n", " ├── c++ compiler: 1\n", " ├── access to google maps: 1\n", " ├── youtube player: 1\n", " ├── natural language processor: 1\n", " ├── graph interaction tools: 1\n", " ├── bablyonian cuniform -> arabic legend: 1\n", " ├── access to youtube: 1\n", " ├── image search tools: 1\n", " ├── calculator or counting function: 1\n", " ├── a speech-to-text audio processing tool: 1\n", " ├── access to academic journal websites: 1\n", " ├── pdf reader/extracter: 1\n", " ├── rubik's cube model: 1\n", " ├── wikipedia: 1\n", " ├── video capability: 1\n", " ├── image processing tools: 1\n", " ├── age recognition software: 1\n", " ├── youtube: 1\n" ] } ], "execution_count": 25 }, { "cell_type": "code", "id": "7fe573cc", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:04:10.960894Z", "start_time": "2025-05-24T20:04:10.958472Z" } }, "source": [ "# system_prompt = \"\"\"\n", "# You are a helpful assistant tasked with answering questions using a set of tools.\n", "# If the tool is not available, you can try to find the information online. You can also use your own knowledge to answer the question.\n", "# You need to provide a step-by-step explanation of how you arrived at the answer.\n", "# ==========================\n", "# Here is a few examples showing you how to answer the question step by step.\n", "# \"\"\"\n", "# for i, samples in enumerate(random_samples):\n", "# system_prompt += f\"\\nQuestion {i+1}: {samples['Question']}\\nSteps:\\n{samples['Annotator Metadata']['Steps']}\\nTools:\\n{samples['Annotator Metadata']['Tools']}\\nFinal Answer: {samples['Final answer']}\\n\"\n", "# system_prompt += \"\\n==========================\\n\"\n", "# system_prompt += \"Now, please answer the following question step by step.\\n\"\n", "\n", "# save the system_prompt to a file\n", "# with open('system_prompt.txt', 'w') as f:\n", "# f.write(system_prompt)" ], "outputs": [], "execution_count": 26 }, { "cell_type": "code", "id": "d6beb0da", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T20:04:11.434032Z", "start_time": "2025-05-24T20:04:11.430890Z" } }, "source": [ "# load the system prompt from the file\n", "with open('system_prompt.txt', 'r') as f:\n", " system_prompt = f.read()\n", "\n", "print(system_prompt)" ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "You are a helpful assistant tasked with answering questions using a set of tools.\n", "Now, I will ask you a question. Report your thoughts, and finish your answer with the following template:\n", "FINAL ANSWER: [YOUR FINAL ANSWER].\n", "YOUR FINAL ANSWER should be a number OR as few words as possible OR a comma separated list of numbers and/or strings. If you are asked for a number, don't use comma to write your number neither use units such as $ or percent sign unless specified otherwise. If you are asked for a string, don't use articles, neither abbreviations (e.g. 
for cities), and write the digits in plain text unless specified otherwise. If you are asked for a comma separated list, apply the above rules depending of whether the element to be put in the list is a number or a string.\n", "Your answer should only start with \"FINAL ANSWER: \", then follows with the answer.\n", "\n" ] } ], "execution_count": 27 }, { "cell_type": "code", "id": "42fde0f8", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T21:32:48.903802Z", "start_time": "2025-05-24T21:32:46.457278Z" } }, "source": [ "import dotenv\n", "from langgraph.graph import MessagesState, START, StateGraph\n", "from langgraph.prebuilt import tools_condition\n", "from langgraph.prebuilt import ToolNode\n", "from langchain_google_genai import ChatGoogleGenerativeAI\n", "from langchain_huggingface import HuggingFaceEmbeddings\n", "from langchain_community.tools.tavily_search import TavilySearchResults\n", "from langchain_community.document_loaders import WikipediaLoader\n", "from langchain_community.document_loaders import ArxivLoader\n", "from langchain_community.vectorstores import SupabaseVectorStore\n", "from langchain.tools.retriever import create_retriever_tool\n", "from langchain_core.messages import HumanMessage, SystemMessage\n", "from langchain_core.tools import tool\n", "from supabase.client import Client, create_client\n", "\n", "# Define the retriever from supabase\n", "load_dotenv()\n", "embeddings = HuggingFaceEmbeddings(model_name=\"sentence-transformers/all-mpnet-base-v2\") # dim=768\n", "\n", "supabase_url = os.environ.get(\"SUPABASE_URL\")\n", "supabase_key = os.environ.get(\"SUPABASE_SERVICE_KEY\")\n", "supabase: Client = create_client(supabase_url, supabase_key)\n", "vector_store = SupabaseVectorStore(\n", " client=supabase,\n", " embedding= embeddings,\n", " table_name=\"documents\",\n", " query_name=\"match_documents_langchain\",\n", ")\n", "\n", "question_retrieve_tool = create_retriever_tool(\n", " vector_store.as_retriever(),\n", " \"question_retriever\",\n", " \"Find similar questions in the vector database for the given question.\",\n", ")\n", "\n", "@tool\n", "def multiply(a: int, b: int) -> int:\n", " \"\"\"Multiply two numbers.\n", "\n", " Args:\n", " a: first int\n", " b: second int\n", " \"\"\"\n", " return a * b\n", "\n", "@tool\n", "def add(a: int, b: int) -> int:\n", " \"\"\"Add two numbers.\n", " \n", " Args:\n", " a: first int\n", " b: second int\n", " \"\"\"\n", " return a + b\n", "\n", "@tool\n", "def subtract(a: int, b: int) -> int:\n", " \"\"\"Subtract two numbers.\n", " \n", " Args:\n", " a: first int\n", " b: second int\n", " \"\"\"\n", " return a - b\n", "\n", "@tool\n", "def divide(a: int, b: int) -> float:\n", " \"\"\"Divide two numbers.\n", " \n", " Args:\n", " a: first int\n", " b: second int\n", " \"\"\"\n", " if b == 0:\n", " raise ValueError(\"Cannot divide by zero.\")\n", " return a / b\n", "\n", "@tool\n", "def modulus(a: int, b: int) -> int:\n", " \"\"\"Get the modulus of two numbers.\n", " \n", " Args:\n", " a: first int\n", " b: second int\n", " \"\"\"\n", " return a % b\n", "\n", "@tool\n", "def wiki_search(query: str) -> dict:\n", " \"\"\"Search Wikipedia for a query and return maximum 2 results.\n", " \n", " Args:\n", " query: The search query.\"\"\"\n", " search_docs = WikipediaLoader(query=query, load_max_docs=2).load()\n", " formatted_search_docs = \"\\n\\n---\\n\\n\".join(\n", " [\n", " f'\\n{doc.page_content}\\n'\n", " for doc in search_docs\n", " ])\n", " return {\"wiki_results\": formatted_search_docs}\n", "\n", "@tool\n", "def 
web_search(query: str) -> dict:\n", " \"\"\"Search Tavily for a query and return maximum 3 results,\n", " formatted with source URL, title, and content.\n", "\n", " Args:\n", " query: The search query.\n", " \"\"\"\n", "\n", " tavily_tool = TavilySearchResults(max_results=3)\n", "\n", " # 'search_docs' is expected to be a list of dictionaries based on your sample.\n", " # Each dictionary contains keys like 'url', 'content', 'title'.\n", " search_docs = tavily_tool.invoke(query)\n", "\n", " final_formatted_docs = []\n", "\n", " if isinstance(search_docs, list):\n", " for doc_dict in search_docs: # Iterate through the list of result dictionaries\n", " if isinstance(doc_dict, dict):\n", " # Extract data using dictionary keys found in your sample:\n", " source_url = doc_dict.get(\"url\", \"N/A\") # From your sample, e.g., 'https://www.biblegateway.com/...'\n", " page_content = doc_dict.get(\"content\", \"\") # From your sample, e.g., '8\\xa0When the king’s order...'\n", " title = doc_dict.get(\"title\", \"No Title Provided\") # From your sample, e.g., 'Esther 1-10 NIV...'\n", "\n", " # Format the output string including source, title, and content\n", " final_formatted_docs.append(\n", " f'\\n{page_content}\\n'\n", " )\n", " else:\n", " # This handles cases where an item in the list returned by Tavily might not be a dictionary.\n", " print(f\"[web_search_DEBUG] Expected a dictionary in search_docs list, but got {type(doc_dict)}: {str(doc_dict)[:100]}\")\n", " elif isinstance(search_docs, str):\n", " # This handles cases where the Tavily tool might return a single string (e.g., an error message)\n", " print(f\"[web_search_DEBUG] Tavily search returned a string, possibly an error: {search_docs}\")\n", " final_formatted_docs.append(\n", " f'\\n{search_docs}\\n'\n", " )\n", " else:\n", " # This handles any other unexpected types for search_docs\n", " print(f\"[web_search_DEBUG] Expected search_docs to be a list or string, but got {type(search_docs)}. 
Output may be empty.\")\n", "\n", " joined_formatted_docs = \"\\n\\n---\\n\\n\".join(final_formatted_docs)\n", "\n", " return {\"web_results\": joined_formatted_docs}\n", "\n", "@tool\n", "def arvix_search(query: str) -> dict:\n", " \"\"\"Search Arxiv for a query and return maximum 3 results.\n", "\n", " Args:\n", " query: The search query.\"\"\"\n", " search_docs = ArxivLoader(query=query, load_max_docs=3).load()\n", "\n", " # print(f\"[arvix_search_DEBUG] ArxivLoader found {len(search_docs)} documents.\")\n", "\n", " processed_docs_str_list = []\n", " for i, doc in enumerate(search_docs):\n", " # print(f\"\\n--- [arvix_search_DEBUG] Document {i+1} ---\")\n", " # print(f\"Metadata: {doc.metadata}\")\n", " # print(f\"Page Content (first 200 chars): {doc.page_content[:200]}...\")\n", " # print(f\"--- End Debug for Document {i+1} ---\\n\")\n", "\n", " # Gather metadata and content for this document\n", " title = doc.metadata.get(\"Title\", \"N/A\")\n", " published = doc.metadata.get(\"Published\", \"N/A\") # 'page' might often be empty for ArxivLoader results\n", " # content_snippet = doc.page_content[:3000]\n", " content_snippet = doc.page_content\n", "\n", " formatted_doc_str = f'\\n{content_snippet}\\n'\n", " processed_docs_str_list.append(formatted_doc_str)\n", "\n", " formatted_search_results = \"\\n\\n---\\n\\n\".join(processed_docs_str_list)\n", "\n", " # print(f\"[arvix_search_DEBUG] Returning: {{\\\"arvix_results\\\": \\\"{formatted_search_results[:100]}...\\\"}}\")\n", "\n", " return {\"arvix_results\": formatted_search_results}\n", "\n", "@tool\n", "def similar_question_search(question: str) -> dict:\n", " \"\"\"Search the vector database for similar questions and return the top results.\n", " \n", " Args:\n", " question: the question human provided.\"\"\"\n", " matched_docs = vector_store.similarity_search(question, 3)\n", " formatted_search_docs = \"\\n\\n---\\n\\n\".join(\n", " [\n", " f'\\n{doc.page_content[:1000]}\\n'\n", " for doc in matched_docs\n", " ])\n", " return {\"similar_questions\": formatted_search_docs}\n", "\n", "tools = [\n", " multiply,\n", " add,\n", " subtract,\n", " divide,\n", " modulus,\n", " wiki_search,\n", " web_search,\n", " arvix_search,\n", " similar_question_search,\n", "]\n", "\n", "# llm = ChatGoogleGenerativeAI(model=\"gemini-2.0-flash\")\n", "llm = ChatGoogleGenerativeAI(model=\"gemini-2.5-flash-preview-04-17\")\n", "\n", "llm_with_tools = llm.bind_tools(tools)" ], "outputs": [], "execution_count": 36 }, { "cell_type": "code", "id": "7dd0716c", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T21:32:53.945450Z", "start_time": "2025-05-24T21:32:53.937949Z" } }, "source": [ "# load the system prompt from the file\n", "with open('system_prompt.txt', 'r') as f:\n", " system_prompt = f.read()\n", "\n", "\n", "# System message\n", "sys_msg = SystemMessage(content=system_prompt)\n", "\n", "# Node\n", "def assistant(state: MessagesState):\n", " \"\"\"Assistant node\"\"\"\n", " return {\"messages\": [llm_with_tools.invoke([sys_msg] + state[\"messages\"])]}\n", "\n", "# Build graph\n", "builder = StateGraph(MessagesState)\n", "builder.add_node(\"assistant\", assistant)\n", "builder.add_node(\"tools\", ToolNode(tools))\n", "builder.add_edge(START, \"assistant\")\n", "builder.add_conditional_edges(\n", " \"assistant\",\n", " # If the latest message (result) from assistant is a tool call -> tools_condition routes to tools\n", " # If the latest message (result) from assistant is not a tool call -> tools_condition routes to 
END\n", " tools_condition,\n", ")\n", "builder.add_edge(\"tools\", \"assistant\")\n", "\n", "# Compile graph\n", "graph = builder.compile()\n" ], "outputs": [], "execution_count": 37 }, { "cell_type": "code", "id": "f4e77216", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T21:32:54.644928Z", "start_time": "2025-05-24T21:32:54.547513Z" } }, "source": [ "from IPython.display import Image, display\n", "\n", "display(Image(graph.get_graph(xray=True).draw_mermaid_png()))" ], "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAANgAAAD5CAIAAADKsmwpAAAQAElEQVR4nOzdCVhU5f4H8Hd2ZmGAGfZNRQQEXAMtNTU1TdNccknNNP8u6a2kMm9mllqZde3q1UwzNXPfcNcyV1xRUVEBEQQldhi22Zh9/j+cG3FpICzO8J457+fhmWc458CwfOddz3kP12q1IoJoaVxEEBggQSSwQIJIYIEEkcACCSKBBRJEAgskiPUZdGZFvkGrMmtVJrPJajTQYHhLIGRz+SyRK1fkyvYJFiIaYpFxRBut2pR5Q52doikv0rt780WuHPi/SmVco54Gfx+eC7uiCN48Johjzj1tSLQkpKO4bUcJog8SRAR/gctHyooeVXsFuYREiwPbiRCdGXSW7BR17v3q/AfVPYbJw7q6IjpgehDvXVWe3lUC/7Cu/TyQc1FVGOENBsXkwEm+YinubTBGB/H8/lIOD/Uc5oWcV3mx/uCaggETfIIjsC7pmRvEs3tLZD78Tr3dEQMcWpf/9BC5T7ALwhVDg3hkfUFQuKhzH0ak0ObQ2vyIWGl4DKZNRjZinstHFP5thYxKIRg+K+DmmQpFgR5hiXFBzLylgsen+jtb16Qpxs8Lhmax1YJjHci4ICbEl3Z5jokptAnpILl4SIHww6wg3jpXEREjFUo4iKmgQZJ5S61RmhBmmBXER6maZ4bJELP1HuWZnFCJMMOgID5K03B5bA6Hif2zuoIjxCmXqhBmGPRfeXhX06aDGDnWBx98cOjQIfTknn/++fz8fEQBvgvbK1AAE4AIJwwKYnmJoa3Dg5iWloaeXGFhYUVFBaJMWBdJ3gMtwglTgmjQWRT5eqGEqinXS5cuzZw5s1evXiNGjPjkk08UipqeaUxMTEFBwaefftq3b1/4VK1Wr1u3bvLkybbDVqxYodPpbF/ev3//nTt3Tp8+Hb4kISFh2LBhsHH48OHvvfceooDYjVeah9eAIlOCCP1E6ib+09PT58yZExsbu2/fvnnz5mVkZCxatAg9Tic8Lly48Ny5c/Bk165dmzdvnjRp0sqVK+H4kydPrl+/3vYdeDzegQMHwsPD16xZ07NnTzgANkKd/vXXXyMKiKUcjdKMcMKUE2M1VSaxG1W/bHJysouLy9SpU9lstq+vb2Rk5IMHD/542KuvvgolX5s2bWyf3r59+/Lly2+//TY8Z7FYbm5uc+fORQ4Bfwr4gyCcMCWIFgviC6kq/jt37gyVbFxcXPfu3Xv37h0UFAQ17B8Pg2LvypUrUHFDkWky1eRAJvt9LAniixyFzWVBlwXhhClVM1RGVaVGRI2IiIhVq1Z5eXmtXr165MiRs2fPhtLuj4fBXqiL4YCDBw8mJSW9/vrrdffy+XzkKJpKE4fLQjhhShBFUq6WyumEHj16QFvwyJEj0DqsqqqC0tFW5tWyWq3x8fHjxo2DIEL1DVtUKhVqIZS2mP8apgRRKOZ4BghMRguiwI0bN6C1B0+gUBw6dCh0dSFkMART9xij0VhdXe3t7W371GAwnD9/HrUQvdbiHSRAOGHQOCJMMWff1SAKQEUMneX9+/fD4F9KSgr0jiGRfn5+AoEAkpeYmAgVMfRjWrduffjw4by8vMrKyiVLlkDLUqlUajR2fiQ4Eh6hWw3fDVEg46bKpxVeJ8kyKIhtosUPUygJInSHocJdvnw5TIfMmDFDLBZDW5DLran7oCt9/fp1KCOhOFy6dCl0rkePHg2DiN26dXvzzTfh0wEDBsBYY71vGBgYCEOJMOgIzUpEgUdp2jZRjh7bbxyDztA26C3HNhaOnB2AmO3X+9rsu+q+o70RThhUIvIFbO9Awc0zFE6d0cLlw4qoZ9wQZpi10kOPofI1c7MaunLUYrH069fP7i7oW8AoIAw7/3FXSEjIpk2bEDVgqBw64OgJf6SwsLDaOZt6oHXo4cP3CsCrp4IYePHU7fOVFou1S1/7WWxoSEWv10PPw+4uiIJEQuGaCn/hR4KOEbRT7e46trHg2ZFeUhkPYYaJV/Ed31QYHuNKrxU5mgXOvzgTzxIdMtXvytGyklwdYpKE+FK5Hx/btx9Dr2uumef4T97TL8rpvtJNE0EKvYMF7WOlCFcMPW8eGnaj44Ku/1KRmojdSfPNC95yh9bmS2VcnFOIyCJMV44pHqZqoTfdOhKvAd5mkXSyPDVR+dxY7+Bw3At+siwdKivQXz5aJhCyA9oJYb5B5Er7Ia3SPH3OPc2N0xUdn3XvPljGZuN1oo1dJIj/lZ9Vff+66mGqxsOHJ/Phi924YilX7MYx43Uis30sllVVbtIozVaLNeOm2kXMDu0kgRTidtJhI0gQ6yt6VF2ab9BUwf/VBGWJVtWcSYQZ5+zs7KioKNSsJB5cZK0559LVg+vfVujqgd0w4Z8iQXSorKys+fPn79mzBxH/iyzmTmCBBJHAAgkigQUSRAILJIgEFkgQCSyQIBJYIEEksECCSGCBBJHAAgkigQUSRAILJIgEFkgQCSyQIBJYIEEksECCSGCBBJHAAgkigQUSRAILJIgEFkgQCSyQIBJYIEF0KBaLVXuHC6IuEkSHslqtJSUliPgDEkQCCySIBBZIEAkskCASWCBBJLBAgkhggQSRwAIJIoEFEkQCCySIBBZIEAkskCASWCBBJLBAgkhggQSRwAK54Y8jvPLKK1qtFp4YDIaysjI/Pz/0+Bb0J06cQMRjDL1NroMNHz68qKiooKBAoVDAO7/gMVdXV0T8hgTREaBEDA4OrruFxWL16tULEb8hQXQEiN2oUaM4HE7tllatWo0bNw4RvyFBdJCxY8cGBQXZnkMu+/TpY2spEjYkiA7C5XKhghYIBPA8MDBw9OjRiKiDBNFxoHaGCMKTHj16kOKwHsaNI1arzWUFMIpiQS1hWP9pJy0n+3Ybl52iQS3AKnHnynz4XB52BRCDxhFNBssv24rzs6oDw8RGXcsEsWXx+OzKUoPZZAl7yrXbIBnCCVOCqK82x6/Kjx3s6dtKhBgv6RcFh4t6j/RE2GBKG3H38ty+Y/1IC
m1iBnparazLR8sQNhgRxJTLVSGdXF1lPET8pmt/eUF2tVppQnhgRBCLcnQiKUlhfTCcWVFkQHhgRK/ZoLNI5SSI9cn8BJpKM8IDI4Ko01isTOwl/wl4f5otuHRVyfmIBBZIEAkskCASWCBBJLBAgkhggQSRwAIJIoEFEkQCCySIBBZIEAkskCASWCDXrFArO/vBc/1j7ty5hYhGkSBSy93d47VJ07y9fRs55uHDrFcmDEV/z8iXny8ozEe0Rapmaslk8tenvNH4Mfcz0tDfU1RUWFlZgeiMBNG+K1cunDl74s7dW0plVfuI6EmTpnXpHGPblXj10u7dW9Lvp8pkntHRnWZMe0su92xoO1TN/zf9lf+s+L5jxy4qteqHzeuuJl6sqCwPD4scMGDwi0NGwJYtWzfAl0MNPnvWO2NGT2zopQ8c3LN124aV/17/yeJ5jx5lh4SEwsEvDBp2Kznp3fdqsj7x1eETxk+ZPu1NREOkarZDp9N9/sVHer3+g38uXvr5yuDg1gs+eqe8vOYKj4zM9PkfzunSJXbzpn1vvzUvKyvjy68WNbK9rq++WpyWeicubj4c07599IqVX6Sm3oHy8pVxr/n4+J49nQTBauSleTyeWq1atfqr999beObU9T69B3z1ryXFxUUQ0y8+XwkHbN92iKYpRKREtMvFxWXD+l1CodDNzR0+hWLp0OF9d1OS+/Tun3I3Gfa+OnEqm82G9ESER2Y/fADHNLS9rtt3bkLmYmOehuczpr/Vp88AN6l7018aPjUajZNfmxEZ2QGeDxo4FErTBw/uw8sh+iNBtE+r1WzY+E3y7RtlZQrbFlsjLLpDZyi05i+Ii3mq+zPP9A4MCLLVmw1tr6tDh8579m6rqqrs1LFrbOwz4WHtn+ilbSIiomxPXF2l8AhlJHIKpGq2A+q7Oe9Mg+Jn4YKlv/x85eSJxNpdYe0iln2xylPutf771ZNeGzn3/dkpKbcb2V7XP+ctGv3yhOtJVxYsfHfUy89v+mGtyWRq+kvbsFgs5IxIiWjHuYSTBoMBWmlQRaL/LZBA92494APadjduXI3fv/PDBXH7409yuVy72+t+odRVCnX3xAmvQ0YvXDy7ddtGicR17JhXm/7STowE0Q7orkLFZ4sCSDh/unZXcvINvUEPgfP09Bo0aKivr3/cuzOKigsVpSV2t9d+YZWy6vTpn4cMHg6tQKij4QOad9DFafpLOzdSNdsREtIO2meHj8RD1Xn12uWbN69B16GkpAh2paTeXrR43pGj+6GsSruXsv/ALkier49fQ9trvyeXw/1xy/pFS/4JxSH0gn/55Vjmg/QO0Z1RzSp1wfByFy+ey83NaeSlGxEU3Boez507mZPzENETZ9GiRcjZ3bum8mkllLg39dLmkDahFot5X/yO79avqqqqeO/dBdXV2t17tpaXK6DmVamU27Zv3LFz86lTx8PC2r///scwfQJ9CLvbKyrKDx/ZN/iFl4KCgiPbd4Cad/uOH6DLkl+Q+9qk6TCOCG0+uczz/v20Hbs2S6Xuo0aOa+il5XIvGGKEeRromKPHPegdO3/o1bNvaGgYVPrFxYWQfmhCQqncxF8zN0MjlXG9AwUIA4xYhGn/N/kdnpX5thYioo7LR0oCQ12inpYiDJA2IoEFEkQCCySIBBZIEAkskCASWCBBJLBAgkhggQSRwAIJIoEFEkQCCySIBBZIEAkskCASWGBEEN08uYgxtxxsOoELmy/A5cIDRpwYKxRzSvP1iPhf+Q+0Mh8+wgMjgtgqSlxZisstljCh05qFEo7cH4uzYhFDghgQIpR5cxOPliDiN6e2FfQagdHdSRl0v+akUxUluXr/tiLPABcOl4kX67BYVlWlSaUwXP1J8crcIA9s6mXEqCCCR/c0GTfU1RpzZZ2bIeoNBjabzeM6ot9msVqNRqOAT1UCNFoti8XicDjs39TtjPBFHOid+IW4dBso4/LxeisyK4j1mM3mBw8enDt3bubMmcghsrKy5s+fv2fPHkQN+OYnTpyALHp4eEgkEoFA4O/vHxYWNmvWLIQ35gZxy5YtL774olgsdnFxQY6iUqlu3LjRt29fRI309PS4uDiFQlF3o8Vi8fPzO3bsGMIYQ69rjo+Pr6iokMvljkwhqlmwxpW6FKKalXEi2revv6QOvNkwTyFiYBDPnDkDjz179pwzZw5yuNLS0m+//RZRacKECVAv134KzcQLFy4g7DEriMuWLcvOzoYnvr4ts5SbUqmEJimiUmxsbNu2bW0tLqiUQ0JCDh06hLDHiJUeAHRKZDIZVFLQLkQth8fjBQYGtm7dGlFJJBJdu3ZNr9fDa0EjBPpGly5devbZZxHGGNFZgb5k//79BwwYgBhj4sSJxcXFp06dsn0KcTxw4MC2bdsQrpw8iGq1urKyMi0tbeDAgQgD0Ebcu3fv7NmzkcPdu3dv0qRJP/74Y1RUFMKPM7cRP/30UxjIgOoJkxQi5KkU6gAAD0FJREFUh7QRGwK96aSkpC+//HLfvn0IP04bRKiMOnToQHVr7El5e3u3SHFYC0ZPMzMzFy9ejDDjhFXz+vXrZ8yYYTAY+HyM5lKxcvjw4e3bt2/duhWfP5GzlYgff/yxu3vNevx4ptAB44hN8dJLL33++ed9+vRJTk5GeHCeICYkJMDj22+/PXbsWISrFmwj1hMaGnrlypXVq1fv2LEDYcBJggijFbbl9j09MTrH7o9avI1Yz8aNGwsLCz/66CPU0mjfRszLy4P/LsyXwDQrIv6Sn3766fvvv4cmIwz4oxZC4xLRZDJNnz5dp9NBc5AuKcSkjVjP4MGDV6xYAY/Xr19HLYSuQYSCHKatZs2aBW0dRB/4tBHradWq1fnz56GmhhFv1BLoF0SYyH/nnXcgiNDp69q1K6IV3NqI9axbt66qqmrevHnI4ejXRvzkk09g4rh3796IoMbp06dXrlwJTUbbQJhj0CmIUGtMnjwZ0VkLzjU/kYKCApiYXrJkSc+ePZFD0KZqfuGFF6KjoxHNYdtGrMff3x/Kxd27d2/YsAE5BA1KxJs3b0JbEHrHDj6tnwpUX7PS7NauXZuRkQF9akQxrEtEjUYzaNAgqbTm1khOkEJE/TUrzQ7GJUaOHAn/hZISapcnwLdEVKvVMOjv4eGB+WTJE6FLG7EehUIBTcZly5Z16tQJUQPTEnH//v1QI7dr186ZUogel+u3bt1CdAP/BZh9WbNmTX5+PqIGpsvSZWZmGo1G5HSgaoaZlerqapgZp11jA4oG6MQgamBaIr7xxhtDhw5FzojH4wmFQuiQQsMD0Ud6enp4eLjtzBIqYBpENze3FpyAdwAYEI2Li0P0ce/evT9eut+MMA3id999d/ToUeTUoFCEx9zcXEQHaWlpkZGRiDKYBhFmPGHsBjFAQkICjCwi7FFdImI6fANB5HK5zl071/rss89wODW1cTExMUlJSYgypI3Y8mwpTExMRLiCepnS4hCRNiI+8vLyTpw4gbBEdb2MSBsRH6NHj1YqlQhLVPdUELZBnDlzprOOIzZizJgx8Lhz506E
[...base64 PNG image data truncated...]", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "execution_count": 38 }, { "cell_type": "code", "id": "5987d58c", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T21:35:11.211429Z", "start_time": "2025-05-24T21:32:55.574557Z" } }, "source": [ "question = \"A paper about AI regulation that was originally submitted to arXiv.org in June 2022 shows a figure with three axes, where each axis has a label word at both ends. Which of these words is used to describe a type of society in a Physics and Society article submitted to arXiv.org on August 11, 2016?\"\n", "messages = [HumanMessage(content=question)]\n", "messages = graph.invoke({\"messages\": messages})" ], "outputs": [], "execution_count": 39 }, { "cell_type": "code", "id": "330cbf17", "metadata": { "ExecuteTime": { "end_time": "2025-05-24T21:35:11.283288Z", "start_time": "2025-05-24T21:35:11.274357Z" } }, "source": [ "for m in messages['messages']:\n", " m.pretty_print()" ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================\u001B[1m Human Message \u001B[0m=================================\n", "\n", "A paper about AI regulation that was originally submitted to arXiv.org in June 2022 shows a figure with three axes, where each axis has a label word at both ends. Which of these words is used to describe a type of society in a Physics and Society article submitted to arXiv.org on August 11, 2016?\n", "==================================\u001B[1m Ai Message \u001B[0m==================================\n", "Tool Calls:\n", " arvix_search (cf1968e8-053c-4d75-a59f-eb196dfed2a4)\n", " Call ID: cf1968e8-053c-4d75-a59f-eb196dfed2a4\n", " Args:\n", " query: AI regulation June 2022 figure three axes\n", "=================================\u001B[1m Tool Message \u001B[0m=================================\n", "Name: arvix_search\n", "\n", "{\"arvix_results\": \"\\nFairness in Agreement With European Values: An\\nInterdisciplinary Perspective on AI Regulation\\nAlejandra Bringas Colmenarejo1, Luca Nannini2, Alisa Rieger3, Kristen M. Scott4, Xuan\\nZhao5, Gourab K. Patro6, Gjergji Kasneci7, and Katharina Kinder-Kurlanda8\\n1University of Southampton, United Kingdom\\n2Minsait - Indra Sistemas and CiTIUS, Universidade de Santiago de Compostela, Spain\\n3Delft University of Technology, Netherlands\\n4KU Leuven, Belgium\\n5SCHUFA Holding AG and University of Tuebingen, Germany\\n6IIT Kharagpur, India and L3S Research Center, Germany\\n7SCHUFA Holding AG and University of Tuebingen, Germany\\n8Digital Age Research Center, University of Klagenfurt, Austria\\nJuly 5, 2022\\nAbstract\\nWith increasing digitalization, Artificial Intelligence (AI) is becoming ubiquitous. AI-based\\nsystems to identify, optimize, automate, and scale solutions to complex economic and societal\\nproblems are being proposed and implemented. 
This has motivated regulation efforts, including\\nthe Proposal of an EU AI Act. This interdisciplinary position paper considers various concerns\\nsurrounding fairness and discrimination in AI, and discusses how AI regulations address them,\\nfocusing on (but not limited to) the Proposal. We first look at AI and fairness through the\\nlenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various per-\\nspectives. Then, we map these perspectives along three axes of interests: (i) Standardization vs.\\nLocalization, (ii) Utilitarianism vs. Egalitarianism, and (iii) Consequential vs. Deontological\\nethics which leads us to identify a pattern of common arguments and tensions between these\\naxes. Positioning the discussion within the axes of interest and with a focus on reconciling the\\nkey tensions, we identify and propose the roles AI Regulation should take to make the endeavor\\nof the AI Act a success in terms of AI fairness concerns.\\n1\\nIntroduction\\nAI applications have grown at an unprecedented rate in recent years and have become ubiquitous in\\nour society. While often deployed with the intention to increase efficiency and fairness of decision-\\nmaking, AI has also sparked many debates on (un)fairness [101]. These debates surround, amongst\\nothers, unfair treatment of individuals and groups due to the reproduction of systemic, institu-\\ntional, and societal biases in AI decisions [14]; the opacity of AI decisions [4]; diverse jeopardies\\nto democracy and societal well-being [74]; risks to consumer privacy [63]; and market inequalities\\nthat are observed in the aggregation of unprecedented levels of power of big companies that develop\\nAI systems (Big Tech) while small and new companies are struggling to enter the market [100].\\nIn many fields of AI application, such as policing, justice, and recruitment, bias and unfairness as\\n1\\narXiv:2207.01510v1 [cs.CY] 8 Jun 2022\\ndescribed above should not only be mitigated to increase fairness but in fact, to avert violating\\nprotected human rights.\\nThe above mentioned undesired effects and consequences of AI application and development\\npropelled the European Union for new regulations, ex-ante reviews, and ex-post monitoring on AI\\nsystems. The European Union intends to assert the AI Regulation through the protection of human\\ndignity and fundamental rights with the Proposal of the Artificial Intelligence Act [85], convinced\\nthat human beings should remain at the center of technological development. However, to make this\\nendeavor of the AI Act a success, to some extent divergent interdisciplinary views and perspectives\\non bias, fairness, and regulation, have to be taken into consideration.\\nWe elaborate on legal, industrial, sociotechnical, and philosophical perspectives in light of iden-\\ntified axes of tension in the debate on AI fairness and regulation: Standardization vs. Localiza-\\ntion, Utilitarianism vs. Egalitarianism, and Consequential vs. Deontological. Further, we discuss\\ndiscrepancies between how these perspectives are addressed in the current Proposal of the Artificial\\nIntelligence Act and make recommendations how they could be addressed for better reconciliation\\nwith all three perspectives and the legal requirements. In sum, we make the following contributions\\nto the ongoing discourse on AI fairness and regulation: i. 
Interdisciplinary perspectives: Com-\\nprehensive interdisciplinary (technical, legal, industrial, sociotechnical, philosophical) discussion of\\nbias, fairness, and regulation (Sections 2 to 6), ii. Mapping tensions of debate: mapping the dif-\\nferent perspectives on fairness in AI applications and regulation on to three axes that reveal tensions\\nin the debate: Standardization vs. Localization, Utilitarianism vs. Egalitarianism, and Consequen-\\ntial vs. Deontological ethics (Section 7), iii. Path forward: Recommendations towards consensus\\nfor a successful AI Act that reconciles divergent perspectives (Section 8).\\n2\\nTechnical Frameworks for Bias and Fairness in AI\\nIn this section we present examples of fairness controversies for selected AI application domains with\\nhigh-stake consequences. Subsequently, we discuss several AI fairness notions and present research\\non guidance to choose between these notions and between measures to mitigate bias in AI systems.\\n2.1\\nExamples of Bias and Unfairness in AI Applications\\nAutomated decision-making systems were suggested to be capable of increased fairness due to avoid-\\nance of human bias interference [52]. However, many cases have come to light in which automatic\\ndecision-making was found to raise critical issues regarding fairness, and reproduces systemic, in-\\nstitutional, and societal biases. Such biases can result in discrimination, unfairness, and issues of\\nprivacy, thus, violating protected human rights (see Section 3). This is especially harmful when au-\\ntomated decision making has high-stake implications for individuals and society. In the following,\\nwe present salient examples.\\nIn Policing and Justice, AI systems are applied across Europe to inform and assist day-\\nto-day police work by profiling people, attempting to predict likely future behavior or locations\\nof future crimes, and assessing the alleged risk of criminal involvement of individuals (e.g., Top\\n600 criminals list and CAS (Netherlands), Delia (Italy), SKALA (Germany). Outcomes of these\\npredictions and assessments are used to justify surveillance, searches, or questioning of alleged high\\nrisk individuals. However they have been suspected to reinforce existing patterns of offending and\\nenforcement [1, 99]. In the judicial arena, automated decision-making is currently being applied\\nin various courts around the world to support certain tasks, such as risk assessment of recidivism,\\nas well as decisions concerning bail amounts, probation periods, and sentencing [94, 116]. Across\\nEurope, such systems are not yet used widely, however, they have been introduced or tested in\\n2\\nsome countries, e.g., in Spain (RisCanvi) or the UK (HART). Završnik [116] highlights potentially\\nviolated rights due to opaque, automated decision-making in the justice system, e.g., the right to a\\nfair trial, the principle of non-discrimination and equality, and the right for explanation.\\nAI systems are further being applied in the domain of Education and Employment, to sup-\\nport candidate selection for higher education admissions and recruitment, e.g., with CV screening,\\ntargeted job advertisement, candidate sourcing, and video screening [2]. 
The risk of bias has been\\ndemonstrated at each of these stages in the recruitment process [11, 56].\\nIn Finance and Banking, AI algorithms constitute the basis of numerous different applications,\\nsuch as market forecasting for trading, or risk management for credit scoring, loan allocations, and\\nmortgage rates [15]. Various cases have come to light in which decisions of such applications were\\nfound to be unfair and biased towards minority borrowers, i.e., with higher mortgage and loan\\nrejection rates for Hispanic and Black borrowers in the US [8, 29], or lower credit limits for women\\nthan for men with equal credit relevant characteristics [36, 106].\\nFor Online Platforms, AI based recommender systems are applied to support users to navigate\\nthe web by filtering information and suggest items (videos, social media content, products, music,..)\\npredicted to be relevant for the user. Recommender systems were found to amplify different kinds\\nof bias, such as representation bias with an over-representation of male, white, and young users [96],\\nand exposure bias where the top 20% of businesses get 80% of the exposure [87], and marketplaces\\npreferentially recommend their own products [21].\\nThis amplifies substantial power imbalances\\nbetween market-dominating platform incumbents (Big Tech) and smaller platforms who do not have\\naccess to equal vast amounts of high-quality consumer data that is vital to enter the market [100].\\nThe resulting immense power concentration in the private hands of very few companies that develop\\nmost AI applications and prioritize profit over benevolence for society poses an additional threat\\nto democracy and society [28, 105]. Further, recommender systems and search result rankings that\\noften optimize to capture attention, determine a large extent of the information to which people are\\nexposed. This can result in distorted exposure to information and viewpoints, as well as exposure\\nto dis- and misinformation, raising issues of fairness and posing a threat to democracies that are\\nreliant on well-informed citizens who can engage in healthy political and social discourse [42, 74].\\nAI systems could threaten democracy and society further by undermining the process of elections\\nthrough targeted advertisements. Such microtargeting provides tools for interference by malicious\\npolitical actors [23, 73].\\n2.2\\nMitigating Bias and Ensuring Fairness\\nMost fairness definitions consider either group or individual fairness. Group fairness is focused on re-\\nquiring that people who belong to protected groups receive on average the same treatment/outcome\\nas the overall population, expressed as the equality of a selected statistical measure across groups [111],\\nsuch as statistical parity, demographic parity, equal opportunity and equality of odds. Individual fair-\\nness focuses on ensuring that any two individuals who are similar except for the protected features\\nreceive equal or similar treatment/outcomes [26]. While ideally, multiple fairness notions would\\nbe met to reach a complete fairness status, this is impossible due to mathematical incompatibili-\\nties between them [68]. 
Criteria to systematize the procedure of selecting between fairness notions\\nwhen making a specific decision have been proposed: Amongst others, the existence of a ground-\\ntruth, base-rates between sub-groups, the cost of misclassification, or the existence of government\\nregulations to meet may be considered [62].\\nFormalization of fairness definitions in a specific context is nuanced and it is important that AI\\npractitioners receive some guidance when designing a fair AI system. Some recent research proposes\\nthe Fairness Compass, a schema in form of a decision tree which simplifies the selection process\\n3\\nby settling for the desired ethical principles in a formalised way [98].\\nA standardized roadmap\\ncould potentially make the identification of an appropriate fairness definition a more straightfor-\\nward procedure, and help document the decision process toward fairness. Audit, monitoring and\\nexplanation might then be more accessible and less expensive. Nevertheless, there should also be\\nspace for stakeholders with deeper understanding of the specific context to contribute refinement\\nand interpretations of any such roadmap.\\nThe fairness notions mentioned above deal with the outcome of automated decision-making.\\nCounterfactual fairness [55] and causal fairness [112], however, have a procedural implication which\\nmight be more suitable for the cases where a counterfactual or causal connection needs to be\\nestablished between features. Most of the existing fairness notions are formalized in a static scenario.\\nIf we want to better understand how bias is encoded in historical data or evaluate the consequences\\nof certain fairness intervention, dynamic fairness notions [20] might offer a better solution.\\nTechnical methods to mitigate bias in algorithms fall under three categories: (1) Pre-processing.\\nPre-processing techniques try to transform/re-balance the data so that the underlying discrimina-\\ntion is mitigated; (2) In-processing. The construction of objective function usually has Utilitarian\\nmotivation behind, e.g. trying to maximize the utility of whole population. In-processing methods\\nfor bias mitigation can be used either by incorporating changes into the objective function or impos-\\ning a fairness constraint; (3) Post-processing. Post-processing methods reassign the labels initially\\npredicted by the black-box model to a fairer state. [66].\\nThe existing technical solutions toward fairness focus on more consequential approaches: the\\noutcome/decision is evaluated by a specific fairness notion and then measures are taken to correct\\nthe unfair outcome/decision. Concerns have been voiced that fairness cannot be simply achieved\\nthrough mathematical formulation approaches as the formalism trap [103] and the seeming success\\nof these technical solutions in the end will hinder pursuits of actual fairness with the cooperation\\nof social practices [39].\\n3\\nA Legal Perspective on Bias and Fairness in AI\\nTo follow one central goal of the EU—the promotion of peace and well-being for its members—\\nEU law aims at ensuring that EU member-states and individuals are treated and treat each other\\nequally and fairly. The blindfolded Justicia further emphasizes the importance of laws that promote\\nfairness, but also fairness within the enforcement of all laws. Decision-making based on machine-\\nlearning could be a promising support for that, to mitigate the unconscious or deliberate biases that\\nwe as humans have. 
However, being trained on (biased) data from previous decisions, the promise\\nof unbiased assessments could not be fulfilled so far [5, 46].\\nIn this section, we will take a structured look at the legal perspective on bias and fairness in AI.\\nWe will start with an overview of EU legislative framework on non-discrimination and the approach\\nto fairness followed by the EU Data Protection Law. Then we will conclude by addressing the\\ntechnical requirements to deal with bias that would be introduced with the AI Regulation Proposal.\\n3.1\\nNon-Discrimination Law\\nThe general principle of non-discrimination in EU law protects people from discrimination and unfair\\ntreatment. European anti-discrimination law is designed to prevent discrimination against particular\\ngroups of people that share one or more characteristics—called protected attributes—and from which\\nthe group acquires the category of a protected group. Concretely, protected attributes under the\\nCharter of Fundamental Rights of the European Union include sex, race or ethnic origin, colour,\\nethnic or social origin, genetic features, religion or other belief, disability, age, sexual orientation,\\n4\\npolitical or any other opinion, language, membership to a national minority, property, social origin,\\nand birth (Art. 21.(1)) [82]. Additionally, the Charter prohibits discrimination on the grounds of\\nnationality, compels the European Union to ensure the equality of everyone under the European law,\\ndemands the respect of cultural, religious, and linguistic diversity, and seeks equality of men and\\nwomen in all areas. Several other European anti-discrimination directives have further covered the\\nlegal protection offered to these protected attributes. Specifically, under the European Legislation\\nmen and women must receive equal treatment in the labour market and regarding the access and\\nsupply of good as services[79, 83]. Likewise, equal treatment must be guaranteed between persons\\nirrespective of their racial or ethnic origin [78], as well as equity shall be respected in employment\\nand occupation in regards to the grounds of disability, religion or belief, age and sexual orientation\\n[77]. Member States expanded the protection towards discrimination through specific national laws\\nand provisions.\\nFurthermore, the European legislation presents two tools to address discrimination, direct and\\nindirect discrimination. Direct discrimination is defined as a situation in which one person is treated\\nless favourable on the grounds of a prohibited criterion than another is, has been or would be treated\\nin a comparable situation [78]. Thus, it is straightforwardly related to the possession of a protected\\nattribute that distinguishes the person from other individuals, regardless of the intention behind\\nthe disparate treatment or the mere existence of less favourable treatment. In the context of data-\\ndriven systems, direct discrimination will cover those cases where the model is not neutral towards\\na protected attribute and offers a less favourable output to individuals on the basis of protected\\ngroups, whether they truly fit into that group or are associated with the protected attribute. 
Since\\nconsciously inputting discrimination into the model will affect its accuracy, these cases are not of\\ngreat concern [113].\\nBy contrast, indirect discrimination will more likely capture many situations of algorithmic dis-\\ncrimination because it affects situations where an apparently neutral provision, criterion or practice\\nwould put members of a protected category at a particular disadvantage compared with other persons\\nunless that provision, criterion or practice is objectively justified by a legitimate aim and the means\\nof achieving that aim are appropriate and necessary [78]. Nevertheless, the prohibition of indirect\\ndiscrimination does not encompass a set of clear and easily applicable rules, it can rather be con-\\nsidered closer to a standard than to a rule [118]. The concept of indirect discrimination results in\\nrather open-ended standards, which are often difficult to apply in practice. It needs to be proven that\\na seemingly neutral rule, practice or decision disproportionately affects a protected group [118]. Due\\nto this, indirect discrimination concerns neutral models, which in principle are blinded to sensitive\\nattributes or do not operate on the basis of those protective attributes. Thus, direct discrimination\\nfocuses on individual cases of discrimination, while indirect discrimination deals with rules and\\npatterns of discrimination and can reveal underlying social inequalities.\\n3.2\\nData Protection Law\\nThe European Union General Data Protection Regulation (GDPR) [84] refers to automated indi-\\nvidual decision-making and seeks, amongst other objectives, to prevent algorithmic discrimination.\\nGenerally, the GDPR states the objective to protect all the fundamental rights recognised under\\nEU law, which the processing of personal data may challenge. According to the GDPR, the core\\nprinciples that shall lead the processing of personal data are lawfulness, fairness, and transparency.\\nConcretely, the principle of fairness entails the processing of personal information that is not in any\\nway unduly detrimental, unexpected, or misleading to the individuals concerned ([48]). Indeed, the\\nprinciple of fairness seeks to protect the individual’s fundamental rights and freedoms, and so, their\\nnon-infringement by such processing. Likewise, the principle of data accuracy requires the control of\\n5\\nthe quality of data for its processing, although it does not address the possible wrongful or dispropor-\\ntionate selection of data and therefore the effect and consequences resulted from such selection [76].\\nTo ensure fair processing, the GDPR requests the use of appropriate mathematical and statistical\\nprocedures for profiling that take into account the risks involved for the interest and rights of data\\nsubjects and prevent discriminatory effects on natural persons (Recital 71 [84]). Furthermore, the\\nGDPR highlights the potential risks to the rights and freedom of natural persons, which could lead to\\nphysical, material or non-material damage, in particular when processing results in discrimination\\n(Recital 75 [84]). Despite these provisions, ensuring fairness is still quite a subjective matter as it\\nrequires that the data processing shall not exceed reasonable expectations nor provoke unjustified\\nadverse effects on the individuals. 
However, what can be considered reasonable expectations and\\njustifiable effects is an open question, leaving the notion of fair processing undefined.\\nHowever, the European anti-discrimination law evidently embedded notions of substantive dis-\\ncrimination and therefore, unjustified algorithmic discrimination, as referred to in Article 5 and\\nRecital 71, implies unfair processing [38]. From the legal perspective, discrimination collides with\\nequality, infringing the principle of fairness; whereas from a technical perspective, algorithmic dis-\\ncrimination straightforwardly entails unfair processing (see Section 2).\\n3.3\\nEU Artificial Intelligence Regulation Proposal\\nWith the EU Artificial Intelligence Act the European Union aims at laying down harmonized rules\\non artificial intelligence with four specific objectives [85]:\\n1) ensure that AI systems placed on the\\nUnion market are safe and respect existing law on fundamental rights and Union values; 2) ensure\\nlegal certainty to facilitate investment and innovation in AI; 3) enhance governance and effective\\nenforcement of existing law and safety requirements applicable to AI systems; 4) facilitate the de-\\nvelopment of a single market for lawful, safe and trustworthy AI applications preventing market\\nfragmentation.\\nIn essence, the Proposal seeks to balance legal certainty and the development of AI systems\\nwhile ensuring an approach that respects European values, principles and laws. The specific pur-\\npose of the Proposal is to establish a classification for trustworthy AI systems based on a risk-based\\napproach, to introduce new legal obligations and requirements on public authorities and businesses\\nfor the development and application of AI systems, to prohibit harmful AI-enabled practices, and\\nto set new monitoring and enforcement regimes. Essentially, the Proposal will set a legal framework\\napplicable for developers and end-users of AI systems which specific characteristics—opacity, com-\\nplexity, dependency on data, autonomous behaviours—can adversely affect a number of fundamental\\nrights enshrined in the EU Charter of Fundamental Rights [85].\\nThe Proposal delimits a set of prohibited AI practices considered harmful because they contra-\\nvene EU values and violate fundamental rights. Second, the Proposal outlines specific obligations\\nto avoid the appearance of bias in two types of high-risk AI systems; (1) those which are intended\\nto be used as a safety component of a product or is itself a product, and this product is subject to\\nan existing third-party conformity assessment, and (2) those which are involved in decision-making\\nprocesses in the following areas; (i) biometric identification and categorization of natural persons,\\n(ii) management and operation of critical infrastructure, (iii) education and vocational training, (iv)\\nemployment and workers management as well as access to self-employment, (v) law enforcement,\\n(vi) migration, asylum, and border control management, and (vii) administration of justice and\\ndemocratic processes (see Section 2.1).\\nAccording to the Proposal, AI systems can only be placed into the EU market if they comply with\\nthe certain minimum requirements specified in the legislation, requirements that become stricter as\\nthe risk associated with the system increases (i.e., minimal risk, low risk, high risk, and unacceptable\\n6\\nrisk). 
Consequently, providers will need to carry out ex-ante conformity assessments and implement\\nquality and risk management systems and post-market monitoring to ensure compliance with the\\nnew regulation and minimise the risk for users and affected persons. However, the Proposal pays\\nlittle attention to identifying the causes and proposing recommendations to tackle the potential\\ndiscriminatory harms of AI systems. Specifically, the Proposal mainly focuses on biases in data\\nsets, forgetting other types such as those that may arise from the choice of algorithms, and the\\noptimization or evaluation of metrics. Additionally, the Proposal may pose unreasonable trust in\\nhuman operators—i.e., human in the loop—to identify and recognise cases of bias and discrimination\\nin AI systems.\\nThe Proposal does not provide detailed guidance on dealing with unavoidable trade-offs for\\nthe different stakeholders when debiasing and monitoring bias in the data set. Nevertheless, some\\ninsights can be found in the Proposal regarding the expected requirements to debias high-risk\\nAI systems.\\nFirstly, there will be an obligation to establish appropriate data governance and\\nmanagement practices concerning the training, validation, and testing of data sets, in particular, to\\nexamine possible biases, ensure the relevance, representativeness, absence of errors and completeness\\nof the data sets, and their consideration with the characteristics or elements that are particular to\\nthe specific geographical, behavioural or functional setting within which the high-risk AI system is\\nintended to be used [85]. Secondly, a novel exception to the Data Protection Regulation will allow\\nto the extent that it is strictly necessary for the purposes of ensuring bias monitoring, detection and\\ncorrection in relation to the high-risk AI systems [85] the processing of special categories of data.\\nFinally, the Proposal asks for developing methods that will ensure the detection of biased outputs\\nand the consequent introduction of appropriate mitigation measures as it recognises the potential\\nof AI systems to develop biased outputs due to outputs used as an input for future operations, i.e.,\\nfeedback loops.\\nInterestingly, the Proposal also details the role of standards and specifications in the AI landscape\\n[85]. On the one hand, the Proposal addresses the use of harmonised standards to presume the\\nconformity of AI systems with the regulation’s requirements. On the other hand, the Proposal\\nentitles the Commission with the duty to adopt common specifications and technical solutions\\nwhen the harmonised standards are insufficient or there is a need to address specific or fundamental\\nrights concerns. In other words, conformance with technical standards and common specifications\\nshould give providers of high-risk AI a level of confidence that they are compliant with the mandatory\\nrequirements of the proposed EU AI Regulation as well as significantly cutting the cost of compliance\\nfor business [65].\\nWhereas neither the standards nor the specifications will be compulsory for\\nproviders of high-risk AI systems, their non-adoption shall entail a justification as to which and\\nwhy other technical solutions were adopted.\\n4\\nAn Industry Perspective on Bias and Fairness in AI\\nSubstantial research on ML fairness, even for industry applications, has originated out of academic\\ncontexts. 
Academic research has first proposed most fairness principles and quantitative methods\\nto mitigate biases and unbalanced data with general application domains [6, 59, 66]. Toolkits ap-\\npeared ready to be integrated for the industry, even if often developed following non-contextual\\ndesign rationales based upon the issues of algorithmic methods [43]. Until recently, the technical\\nnature of academic contributions have often not addressed the practical issues that industry prac-\\ntitioners face when adopting and engaging with fairness tools. Practitioners have pointed out the\\nlack of ethical tools’ usability in real-world applications due to a series of critical factors preventing\\nthe straightforward adoption of fairness principles and methods [69]. Following Morley et al. [71],\\n7\\nsuch non-effectiveness in real-world cases stems from how fairness compliance is operationalized\\ninside companies. If not developed with the sociotechnical features and constraints of AI product\\ndeployment in mind, these methods could easily lead to failures [43] including for example fair-\\nness definitions misinterpretation [54], obfuscation of practitioners’ accountability [81], and gaming\\nfairness measures as a method of ethics-washing [71]. To avoid shortcomings, researchers are now\\nfocusing on how to operationalize fairness frameworks based on the needs of industry practitioners.\\nVeale et al. [110] conducted interviews with decision makers in high-stakes public-sector contexts.\\nPractitioners were found to be lacking incentives and practices for algorithmic accountability due\\nto resource constraints and dependency on prior infrastructure. Holstein et al. [44] enlarged the\\npool of industry practitioners with a systematic investigation of ML product development. Amid\\nthe area of intervention were identified issues of data quality provenance and reporting, as well as\\nthe need for domain-specific educational resources and compliance protocols, intended specifically as\\ninternal auditing processes and tools for fairness-focused debugging. Rakova et al. [92] reported that\\npractitioners often felt a hostile organizational environment where they were hindered or uncom-\\npensated when trying to implement fairness practices independently. Disincentive stems from the\\nlack of educational programs, rewards, accountability allocation, and communicative protocols over\\nfairness issues, especially when different parts of an AI development are distributed across different\\nteams. This resulted in practitioners often feeling disoriented, unprepared, or even overwhelmed by\\nfairness tools and checklists [19, 44]. It was also observed that practitioners recommend establishing\\ninternal and external investigation committees to create an inclusive and preventive environment\\nand to provide resources such as protocols or educational teams [61, 92]. Other research examples,\\nonce informed on practitioners’ needs, focused on designing different AI fairness solutions: check-\\nlists to be aligned with teams’ workflows and organizational ad-hoc processes, fairness frameworks\\nor internal algorithmic auditing protocols designed for industrial applications [61, 91]. Recently,\\nRichardson and Gilbert [97] proposed a complete industry framework of stakeholders and fairness\\nrecommendations while specifying operationalization pitfalls. 
Ibáñez and Olmeda [47] distinguished\\ntwo main perspectives on operationalizing fairness practices in organizations: a bottom-up, reactive\\napproach, where prior organizational processes restrain best practices, or top-down, where a proac-\\ntive approach is set in place according to the translation of principles and methods as actionable,\\niterative steps designed with stakeholders’ needs and concerns in mind. Interestingly, the literature\\nagrees that fairness interventions should not be standardized and reactive to prior single instances of\\norganizational infrastructure issues, but proactive, based on a thorough understanding of different\\nstakeholders’ needs, and accounting for domain-specific and contextual factors.\\nIn regards to the Proposal, it is not yet clear how fairness practices will be effectively opera-\\ntionalized given the mechanisms envisioned in Articles 43 and 61 from the Proposal, respectively\\nfor conformance checking and post-market monitoring of high-risk systems.\\nFor those systems,\\nproviders will be demanded to draft and verify their conformance through a quality management\\nsystem, technical documentation, and post-market monitoring under the lens of a national body.\\nThis body will be guided by a national supervisory authority in coordination with the EDPB (Eu-\\nropean AI Board from the EU commission). Yet, some detractors, in line with some concerns over\\norganizations’ ethics washing, advanced skeptical doubts on the procedural efficacy of these audit-\\ning mechanisms [60, 64]. Doubts were related to the undisclosed nature of conformity declarations\\nas well as the nature of contributions of data criteria input to the EU database for stand-alone\\nhigh-risk AI systems in Article 60, withheld from the scrutiny of those affected by such systems\\nand available only upon regulatory bodies’ request. This loose gravity towards the public interest\\nmight not permit to enforce EU citizen fundamental rights to decide whether a system should be\\nlisted as high-risk. In light of the concerns for more structural fairness practices, the evolution of an\\noverly rigid and costly compliance environment could critically undermine these needs. An official\\n8\\nimpact assessment has been proposed [95] to quantify these costs. Mueller [72] advanced an analysis\\nof the economic costs that could arise for EU small and medium enterprises and corporations. In\\nthe forecast, effects will push away venture capital investors, drain European talents and tighten\\nstronger external dependencies leading to a highly unfavorable European environment, with the risk\\nof being excluded from the global AI market. Academics and policy analysts have advanced a de-\\nbate on the validity of those claims, picturing less-burdening assessments over quality management\\nsystems, thus calling the report factitious [37, 57]. Future predictions will need to account both\\nfor amendments to the terminology and procedures. Foremost, central analysis focus should be\\ngiven to the ecosystem of digital technology regulations that the EU has on its agenda [80]. These\\ndigital Proposals constitute the European intention of enforcing its legislative sovereignty and set\\nstandards for the international market. 
Leveraging the Brussels Effect [12, 31] and the current rise\\nof AI ethics attention across a wide range of institutional and academic stakeholders [35, 102], it is\\nreasonable to predict that in the near future current investments in integrating fairness governance\\npractices could be streamlined into more mature and efficient regulatory frameworks with lower\\nprocedural costs while mitigating reputational risks [92].\\n5\\nA Sociotechnical Perspective on Bias and Fairness in AI\\nRegarding AI fairness and discrimination, many have pointed out that AI is not merely a tool, it is\\na sociotechnical endeavour, meaning that the development, use of (and harm from) AI technologies\\ncan not be separated from their specific social contexts [27, 90].\\nWhen attempting to prevent\\nharm from technologies we must look closely at a new technology’s actual capacities and functions\\nwithin these contexts. An over-emphasis of the role of specific technological features of AI in either\\ncausing, or preventing, discrimination, for example, can obscure other forms of discrimination that\\nare occurring, as well as lead to an unproductive and ultimately distracting focus on fixing or\\nregulating those specific features [33, 90].\\nVeale and Borgesius [109] make a similar argument in regards to the Proposal.\\nThey cite\\nthe examples of the prohibition against releasing AI systems that use subliminal or subconscious\\ntechniques to distort a person’s behaviour and argue that this focus on evocative, ripped from the\\nheadlines potential harms does little to mitigate actual harms and adds little to existing legislation\\n[109]. Issues include, for instance, that prohibition only covers manipulative systems that cause\\nindividual harm but not a collective harm or a harm that arises from dynamics of the user-base\\nentwined with an AI system [109] and that there must be intent to distort behaviour. Dourish\\nand Bell [25] identified a similar phenomenon surrounding the discussion and implementation of\\nubiquitous computing technologies and contrast the myth used to build visions of technologies and\\nthe messiness of the practical implementation of technologies in reality.\\nThey further describe\\nubiquitous computing researchers as explaining away limitations and unexpected consequences of\\nspecific systems by referring to a proximate future where the given technology will be fully realized\\nand highly useful, as soon as a few remaining kinks (such as unevenly distributed infrastructure,\\nfor example) are ironed out [25].\\nIn the case of the messy realities of AI, it is widely acknowledged that it is non-trivial to build\\nerror-free models and good quality data within the context of societal factors and power structures\\nat play [18, 27, 67]. To give a specific example, data workers who are frequently manually labeling,\\ncleaning, and enriching the data used for training AI models, have a crucial role in the development\\nof AI systems and their practices are subject to a myriad of non-objective influences [67]. Similarly,\\nthe harms often identified with AI use online, such as hyper-personalization, invasion of privacy, and\\nspread of hate speech can stem from issues beyond the technology, such as monopolies, data power\\n9\\nimbalances, and un-checked corporate crime [24]. 
Some have argued that those aspects of online\\nlife are a requisite feature of an emerging economic system that has grown out from the existing\\ncapitalist economic system [117].\\nTherefore, we must acknowledge the systemic sources of the discrimination when mitigating\\ndiscriminatory harm of AI technologies and the discussion of the impact of such technologies should\\nstart at an earlier point. In particular, we must look at the specific setting of a given case. This\\nincludes considering what specific sociopolitical goals a given AI system is enforcing. For example,\\nin Austria, a risk assessment algorithm created for use in the public employment system has been\\ndescribed as guided by a philosophy of neo-liberal austerity in the social sector which has been\\nreplacing the concept of the European welfare state [3]. We must also consider where the discussions\\nare happening, who is involved in the discussions, and how the population is able to discuss and\\nenforce whether an AI in a domain should be used at all. In regards to the Proposal, according\\nto [109], there is evidence of industry influence in high level policy decision-making surrounding the\\ncurrent Proposal.\\nAnother complication in regulating and mitigating harm from AI is the complexity of determin-\\ning how, or if, it is possible to distinguish between AI decisions and human decisions. If we do not\\nacknowledge these entanglements, there is a risk of bias being addressed with overly mechanistic ap-\\nproaches. In reference to the example of privacy ethics, Nissenbaum [75] has described how a focus\\non the very attempt to mitigate privacy concerns by ever more sophisticated anonymization meth-\\nods can lead to overlooking other issues, such as algorithms that do not infringe on privacy, yet are\\nstill harmful. Similarly, a focus on attempting to operationalize a very specific concept of fairness,\\nand to regulate specific methods for monitoring it, risks pulling awareness from other algorithmic\\nharms, or even obfuscating underlying causes of harm [7, 90]. In the case of the Austrian AMS,\\ndescribed above, the controversy of a proposed algorithm opened up a whole discussion about how\\na Public Employment System should be run overall. From the perspective of power aware analysis\\n[67] everyone affected needs to be involved in those decisions.\\n6\\nA Philosophical Perspective on Bias and Fairness in AI\\nWe also look at developments in AI and algorithmic fairness through the lens of moral philosophy,\\nspecifically normative ethics [49], which essentially investigates the question of whether something is\\nmorally right or wrong. There are two major schools of thought in normative ethics; (i) Deontological\\nethics argues the existence and significance of inherent rightness of an action (examples include\\nKant’s categorical imperative [86], and Rawls’ veil of ignorance [93]); (ii) Consequentialism judges the\\nmorality of an action based on the value it brings (examples include welfarism [51], hedonism [70]).\\nWhile our deontological views inform the building blocks of morality in today’s society (e.g., EU\\nfundamental rights), consequential approaches enjoy scalability through the use of representative or\\nproxy metrics in real-world usages (e.g., cost-benefit analysis [58] or per-capita income in economics,\\nand overall accuracy in machine learning as discussed in Section 2). 
Traditional AI research often\\nfollows a declarative approach where a mathematical objective is designed and optimized while\\ncaring less about the decision-making process and its correctness or representativeness [13, 16,\\n29].\\nSuch an approach can be argued to be a consequentialist’s approach to AI whereby only\\nthe optimization of final objective matters and the end justifies the procedure.\\nHowever, this\\napproach has received a lot of critique within the AI domain, and a range of issues have been pointed\\nout; for example concerning causality [17, 34], fairness [29, 66], explainability [13], including the\\ncomparability and robustness of explanations [88, 89], and trustworthiness [107].\\nAnother angle from which AI developments can be looked at, is Welfarism [51] (a type of\\n10\\nconsequentialism), which suggests choosing the action that maximizes the welfare or well-being of\\nthe population. In fact, it is widely used in some areas of economics, game theory, social-choice\\ntheory, and applications. Welfarism is often studied in two major forms; (i) Utilitarianism [104]\\nemphasizes maximizing the welfare of the population; (ii) Egalitarianism argues for equality often\\nleading to a form of Rawlsian justice [93] which comes under deontological ethics, but its objective\\nform in welfarism tries to maximize the welfare of the worst-off. Utilitarianism is found to be heavily\\nembedded in today’s society. For example, the optimization objectives (loss functions) in machine\\nlearning are often the aggregate errors over the set of data points or the individuals, i.e., utilitarian\\nin nature. Utilitarian social welfare is quite prevalent in economics, computational social choice\\n(allocation, voting, etc.)1. Such utilitarian objectives tend to optimize for the overall utility while\\nmay be best-serving the majority and poorly serving the minority populations. This is one of the\\nreasons due to which the usual loss-minimizing objectives have been found to be unfair in many\\napplications including criminal justice, banking, and gig-economy. On the other hand, egalitarian\\nwelfarism in machine learning would likely try to equalize the errors of all or groups of individuals\\ninstead of minimizing the aggregate errors.\\nIn fact algorithmic fairness notions like individual\\nfairness [26], equal opportunity and equality of odds [40], equal mistreatment [114] are either inspired\\nby or promote egalitarian views in consequential modeling (error represents a consequence). These\\nnotions have been found to reduce the effects of pre-existing biases in data and to improve the\\nutilities of marginalized groups under algorithmic decision-making systems.\\nA few recent works have also explored non-consequential or deontological approaches to algo-\\nrithmic fairness. These works can be grouped into two categories. (1) Works on procedural fairness\\n[30, 32] argue that it is essential for the chosen design and principles to be socially acceptable.\\nThus, these works focus on understanding how people assess fairness and ways to infer societal\\nexpectations about fairness principles thereby accounting for all voices in designing fair decision-\\nmaking systems. For example, Grgić-Hlača et al. [32] propose a framework for procedural fairness\\nby evaluating the moral judgments of humans regarding the use of certain features and accordingly\\ndesigning decision-making systems. 
(2) Another set of works argue for causal and counterfactual\\nfairness, i.e., addressing unfair causal effects of sensitive attributes in the decision-making process\\n[17, 55]. Instead of focusing on the outcome alone, these works have explored deontological aspects\\nand propose to ensure fairness in the decision-making process.\\n7\\nMapping Perspectives\\nWe have identified three axes along which different perspectives in the debate about AI Regulation\\nfor preventing unfairness and discrimination fall. These axes may or may not be orthogonal, i.e.,\\nthey may have relationships with each other. In the following sections, we define the axes and\\ndescribe the debate surrounding regulating fairness in AI represented by each axis. These are not\\nall of the axes of debate, rather these are salient tensions that we have identified. We find them\\nhelpful in conceptualizing and mapping the values and desiderata of the perspectives we are focusing\\non.\\n7.1\\nAxis-1: Standardization vs. Localization\\n7.1.1\\nThe axis\\nThis first axis of interest addresses the differences between standardization and localization. Stan-\\ndardization entails the process of making things of the same type all have the same basic features\\n1Nash social welfare [50] is an exception.\\n11\\nStandardized\\nLocalized\\nEgalitarian\\nUtilitarian\\nConsequential\\nDeontological\\nSubjective\\nHard to deploy\\nInclusive\\nObjective\\nEasy to deploy\\nMajoritarian\\nFigure 1: Three identified axes along which the debate about regulation of fairness in AI falls; Here they\\nare aligned along high-level characterizations and common arguments made for, and against, each of the\\nends of the identified axes.\\n(from Cambridge dictionary), specifically, through the creation of protocols to guide the design, de-\\nvelopment, and creation of such goods or services based on the consensus of all the relevant parties\\nin the industry. Standardization is intended to ensure that all the goods and services produced\\nrespecting those protocols come with the same or equivalent quality, safety, interoperability and\\ncompatibility. For this reason, multiple parties need to be involved in developing such protocols\\nand standards, namely, manufacturers, sellers, buyers, customers, trade associations, users or regu-\\nlators (https://www.iso.org/standards.html). By contrast, localization describes the process of\\nmaking a product or a service more suitable for a particular country, area, etc. (from Cambridge\\ndictionary). In essence, localization entails adapting the product or service to the characteristics of\\na given culture, region, or society.\\n7.1.2\\nPros and cons\\nIn the context of AI, advocates for and members of industry frequently cite standardization as\\na method for preventing or mitigating discrimination [41, 53, 108]. In this respect, high-risk AI\\nsystems will be presumed to comply with the requirements established in the AI Proposal if they\\nare, as well, in conformity with the harmonised standards published by the Official Journal of the\\nEuropean Union as referred to in article 40 [85]. Likewise, high-risk AI systems in conformity with\\nthe specifications referred to in Article 41 of the AI Proposal will be presumed in conformity with\\nthe regulation [85]. 
In this sense, conformity with standards and specifications as proposed in the AI\\nRegulation will allow the entry of high-risk AI systems in the European market while guaranteeing\\nagreed levels of quality and safety that ensure the adherence to European principles and values (i.e.,\\nnon-discrimination, fairness, and human dignity).\\nA dilemma regarding standardization, however, appears when there is a disagreement regarding\\nthe standard of fairness that should be used to assess AI systems. As presented in Section 2.1 the\\nstraightforward example of incompatible fairness standards referred to the case of COMPAS and\\nthe different standards followed by ProPublica [5] and Northpoint [22] for their fairness assessments,\\ni.e., disparate mistreatment and calibration respectively [118]. Moreover, overly specific and strict\\n12\\nstandards and frameworks risk encoding a biased, restrictive, non-relevant to everyone, singular\\nworldview, and may ultimately lead to uniformization from a top-down approach Section 4. In\\ntruth, standardarization as a method to enforce fairness can in some cases overlook the root-causes\\nof bias, setting standards and notions of fairness that do not offer a real solution to the intrinsic dis-\\ncrimination or biases in certain situations or contexts Section 5. A—purely hypothetical—example\\nof this problem would be the hard-coded requirements for gender parity in school admissions or\\nhiring where there was a low representation of one of the genders, e.g., due to relocation for work\\nreasons or armed conflicts. The solution would be to establish an acceptable ratio of males to females\\nset at a level appropriate to the local context, rather than a strict gender parity requirement.\\nIn this regard, localizing AI systems entails the process of making them local in character by\\nlimiting the ethics regulation and specifics of enforcement to the desired area. Whereas the complete\\nlocalization of AI systems will be in conflict with the embedded values of the AI Regulation (e.g.,\\nEuropean Common Market and European Fundamental Rights), the localization of some of the\\ndecisions regarding their design, development, or deployment may allow a more tailored approach\\nto address AI discrimination and biases in specific geographical, cultural, or sociotechnical contexts.\\nThe localization of some requirements and technical solutions may, as well, allow for the definition\\nof ethical and legal guidelines that address the specific circumstances of a community, local area, or\\nsector beyond the general standards and specifications.\\n7.2\\nAxis-2: Utilitarian vs. Egalitarian\\n7.2.1\\nThe axis\\nThe second axis of interest addresses differences between utilitarian and egalitarian views. While\\na utilitarian philosophy is one of maximizing the overall welfare of the population, egalitarianism\\naims for equality amongst all those people.\\n7.2.2\\nPros and cons\\nUtilitarianism has long been argued to be in conflict with the certain conceptualizations of fairness\\n(see Chapter 14 of Hooker [45]). In the context of AI, algorithms are often designed to optimize\\nfor certain mathematical objectives (which can be categorized as a declarative approach).\\nThe\\nobjective functions in machine learning tasks usually measure a form of aggregate accuracy over a\\npopulation, which fits the definition of a utilitarian measure. 
Optimizing solely for such a measure in\\nAI applications risks optimizing the utility of the whole population while hurting minority groups in\\nmany [40, 114]. Utilitarian approaches are so ingrained in the computing research and development\\nmindset that the early group fairness notions—which are supposed to mitigate the discriminatory\\neffects of utilitarian objectives—such as demographic parity, had been reduced to utilitarian forms\\nby constraining over the aggregate benefits or outcomes of groups of individuals [115]. The literature\\nhas now moved on to notions such as individual fairness, equal opportunity, and treatment parity\\nwhich, even though outcome-based, are more egalitarian in nature.\\nDespite its obvious conflicts with fairness, and egalitarianism’s close connection with fairness,\\nutilitarian welfare is often cited a necessary factor in system and policy design. In fact, protecting\\nthe EU’s economic interests is stated as a goal of the AI Act [85]. Since utilitarianism captures\\na certain overall efficiency of a system (accuracy in machine learning, utilitarian welfare in eco-\\nnomics), its goals often reflect business-oriented metrics of AI applications (i.e., click-through rate\\nfor recommendations in online marketplaces, or success-rate of ranked workers on gig-economy plat-\\nforms). However, there might be a trade-offbetween maximizing efficiency and achieving other\\n13\\nsocial objectives like equity or fairness in cases of inherent imbalance in the data or population\\n[9, 10].\\n7.3\\nAxis-3: Consequential vs. Deontological\\n7.3.1\\nThe axis\\nThis third axis of interest from the discussions in Sections 3 to 6 represents the differences between\\nconsequential and deontological ethics. Deontological ethics argue for the existence of the inherent\\nrightness of an action, while consequential ethics evaluate morality based on the consequences of an\\naction.\\n7.3.2\\nPros and cons\\nTechnical measures for mitigating AI based discrimination tend to focus on fairness notions, whereby\\na fairness constraint is often added to the original objective. Fairness in this case is defined by\\nstatistical properties of the outcome/decision of the system (e.g., demographic parity). Fairness\\nnotions thus seek to reduce harm by adjusting or influencing the outcome to fit some statistical\\ndefinition of fairness. While the motivation for doing this may be based on deontological principles\\nof equality, this approach belies a consequentialist definition of fairness, wherein one declares that\\nfairness has been achieved through an equality in outcome, such as equal amount of good (accurate)\\nand bad (inaccurate) outcomes for each group.\\nDeontological ethics is often given as an opposite to consequentialism. A deontological approach\\nargues for the existence and significance of the inherent rightness of an action; in the context of\\nAI based discrimination, this would suggest that the approach described above does not meet the\\ncriteria of acting morally, as the focus is on shifting the outcome. From a deontological perspective,\\nan AI system is unlikely to be fair if the development of AI itself is not driven by essential guiding\\nprinciples, such as fairness.\\nThe Proposal’s prohibition of certain uses is based on deontological principles of protecting\\nfundamental individual rights. However, the risk based approach could be viewed as consequential,\\nin that it only targets systems used in contexts perceived as being highly consequential. 
This means\nthat many AI systems that might exhibit harmful representational or discriminatory biases, such\nas social media and online platforms, are relieved of any requirements.\nSummary: Based on the pattern of high-level characterizations and common arguments made\nfor, and against, each end of the identified axes, we place them along a single axis, with one end\ncontaining localized, deontological, egalitarian approaches (LED) and the other end containing\nstandardized, utilitarian, consequential approaches (SUC); we illustrate this mapping in Figure 1.\nThe LED end contains approaches that purport to acknowledge systemic and complex causes of\ndiscrimination and are often criticized as being overly subjective and hard to deploy. The approaches\non the SUC end purport to be objective and easy to implement while often being critiqued as\nfailing to recognize systemic causes or ensure inclusion of minority voices. This mapping of the\nperceived benefits and shortcomings of each approach allows us to identify a key tension in the\ndebate on regulating fairness in AI. It is one that is based on differing understandings of the nature\nof bias and discrimination, along with differing priorities as to what constitutes practicality and\nimplementability in efforts to increase fairness. Following this, we suggest how the Proposal could\nbetter balance these values, as well as the differing perspectives of stakeholders, to achieve the stated\ngoal of guaranteeing agreed levels of quality and safety in accordance with European principles and\nvalues (i.e., non-discrimination, fairness, and human dignity) without creating major hurdles for the\nEuropean AI Industry.\n8\nKey Agreement and A Path Forward\n8.1\nKey Agreement\nWe see a specific agreement amongst the presented perspectives regarding limitations of the current\nregulation. Ultimately, each of the perspectives agrees that regulation needs to be grounded in the\nreality of the context of the use of AI, and that this is not sufficiently achieved in the Proposal. A brief\nsummary of these previously discussed realities that the Proposal does not sufficiently account for\nis as follows: 1. lack of agreement on what technology like AI really is and what its capabilities are,\n2. cost and complexity for a business to follow the required regulations, 3. the known limitations\nof debiasing techniques and explanations of black boxes, 4. lack of specifications on how to best\nimplement human oversight in the context of AI systems, 5. varied and shifting notions of fairness\nwithin society, 6. impact of power imbalances (e.g., technological divide, data power, company size,\nand market share) on the creation and enforcement of, and the ability to comply with, the Proposal.\n8.2\nA Path Forward: Balancing Perspectives\n8.2.1\nStandardization and Localization\nStandardization may facilitate the translation of fundamental rights, i.e., the right to fairness, into\nstandards and specifications to be followed and complied with by all AI actors with the aim of\nensuring that AI systems do not discriminate against or mistreat individuals.\nLikewise, localization may allow the clarification of deontological values in more specific and\nconcrete requirements, metrics, or assessments, particular to each enforcement context. This is to\nprevent a top-down enforcement of operationalizations of fairness that are untenable, or even unfair,\nin some contexts. 
For example, in Section 4 we have summarized the literature demonstrating that\nensuring fairness compliance from the AI industry could also be served by a more localized approach\nto operationalizing fairness. This does not imply the relativization of the legal and ethical principle\nof fairness but, on the contrary, takes into account the wider scenario beyond the purely technical\nnature of AI and strengthens the enforcement of fairness during the whole life cycle of AI.\nProposed role of AI Regulation\nStandardization should be used to the extent that the measure\nhas a direct link to upholding the deontological value of fairness. In order to ensure the principle\nof universalization, though, special care must be taken to build in flexible localization allowances.\n8.2.2\nUtilitarian and Egalitarian\nIt may be possible to maintain an egalitarian approach to AI Regulations, while also taking\nadvantage of the potential benefits of utilitarian measures. For example, to promote equality (i.e.,\nbring in egalitarianism) all stakeholders could be given sufficient power to provide inputs on how\nto maximize and measure their welfare. Any decisions about utilitarian measures would then be\nbased on this input. Note that increased awareness of the use of AI systems and their\nimplications toward fairness among the responding individuals (stakeholders) is essential for a successful\nprocess. This approach would, again, bring up the question of standardization versus localization,\nspecifically, how highly localized measures would need to be to adequately account for the policy\nexpectations of all individuals in an egalitarian fashion. To address this, we would defer to the\nprinciples suggested in Section 8.2.1. Extensive work is needed to determine how best to implement\nsuch a process, but some of the open questions may be best left answered by the inclusive input\nprocess itself.\nProposed role of AI Regulation\nThe specific framework for how to obtain and incorporate\nstakeholder inputs should be laid out. A way needs to be found to ensure that all stakeholders\nhave sufficient power and influence in AI Regulation decision-making processes and that they are\nthemselves sufficiently aware of the potential adverse implications of AI technology.\n8.2.3\nDeontological and Consequential\nThe EU’s stance on fairness is deontological, in that fairness is justified by itself, with no direct\nsubordination to its eventual outcomes. What matters is whether the action is motivated by duty\n(respect of the moral law: dignity and universalization). However, expectations of individuals on the\nspecifics of what constitutes freedom, equality, and dignity may vary across cultures, geographies,\nand contexts. This has led digital and human rights groups to highlight that AI policies should\nempower individuals, communities, and organisations to contest AI-based systems and to demand\nredress when they themselves determine that their fundamental rights have been violated [7].\nThe Proposal itself is not intended to legislate individual rights; that is intended to be covered\nin other laws of the European legal framework. With that in mind, the Proposal could still enforce\nan individual’s need to be informed and to understand the impacts. Therefore, transparency and\nexplainability of the design, development, and implementation of AI systems, as well as of their output,\nremain paramount. 
There must also be understandable and effective methods for stakeholders to\nadjust the specific standards, such as what uses are forbidden, in the case of unforeseen use cases\nand impacts or of the recognition of previously ignored violations of the European principles.\nProposed role of AI Regulation\nRequirements such as documentation and transparency should\nspecifically serve stakeholders’ needs to understand the implications of AI systems for their specific\nsituation, life, and work.\n9\nConclusion\nIn this position paper, we presented technical, legal, industrial, sociotechnical, and (moral)\nphilosophical perspectives on the debate on fairness in AI systems with a particular focus on the Proposal\nof the EU AI Act. We identified a pattern of common arguments representing a key tension in the\ndebate with one side containing deontological, egalitarian, localized approaches and the other side\ncontaining standardized, utilitarian, consequential approaches. We discussed how different\n(symbolic) ends of the axes could be reconciled and proposed the following roles that the AI Regulation\ncould take to successfully address these tensions: (1) apply standardization to uphold\ndeontological values, but ensure universalization by including flexible localization allowances; (2) lay out\na framework to incorporate stakeholder inputs and ensure that they are sufficiently aware of\npotential adverse implications of AI technology; and (3) design requirements of documentation and\ntransparency so that they serve the needs of stakeholders.\nAcknowledgements: This work has received funding from the European Union’s Horizon 2020\nresearch and innovation programme under Marie Sklodowska-Curie Actions (grant agreement\nnumber 860630) for the project “NoBIAS - Artificial Intelligence without Bias” and (grant\nagreement number 860621) for the project “NL4XAI - Natural Language for Explainable AI”. The\nauthors would like to thank all the panelists of European AI Regulation Week 2021\n(https://aia-week.pages.citius.usc.es/) for their helpful comments on AI regulation. This work\nreflects only the authors’ views and the European Research Executive Agency (REA) is not responsible\nfor any use that may be made of the information it contains.\nReferences\n[1] Angelika Adensamer and Lukas Daniel Klausner. 2021. “Part Man, Part Machine, All Cop”:\nAutomation in Policing. Frontiers in Artificial Intelligence 4 (2021), 29.\nhttps://doi.org/10.3389/frai.2021.655486\n[2] Edward Tristram Albert. 2019. AI in talent acquisition: A review of AI-applications used in\nrecruitment and selection. Strategic HR Review 18, 5 (2019), 215–221.\nhttps://doi.org/10.1108/shr-04-2019-0024\n[3] Doris Allhutter, Florian Cech, Fabian Fischer, Gabriel Grill, and Astrid Mager. 2020.\nAlgorithmic profiling of Job Seekers in Austria: How austerity politics are made effective. Frontiers\nin Big Data 3 (2020). https://doi.org/10.3389/fdata.2020.00005\n[4] Mike Ananny and Kate Crawford. 2018. Seeing without knowing: Limitations of the\ntransparency ideal and its application to algorithmic accountability. New Media & Society 20, 3\n(2018), 973–989. https://doi.org/10.1177/1461444816676645\n[5] Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. 2022. Machine Bias*. Ethics\nof Data and Analytics (2022), 254–264. https://doi.org/10.1201/9781003278290-37\n[6] Jacqui Ayling and Adriane Chapman. 2021. Putting AI ethics to work: are the tools fit for\npurpose? 
AI and Ethics (2021), 1–25. https://doi.org/10.1007/s43681-021-00084-x\\n[7] Agathe Balayan and Seda Gürses. 2021. Beyond Debiasing: Regulating AI and Its Inequalities.\\nTechnical Report. Delft University of Technology.\\n[8] Robert Bartlett, Adair Morse, Richard Stanton, and Nancy Wallace. 2019. Consumer-lending\\ndiscrimination in the FinTech era. Technical Report. National Bureau of Economic Research.\\n[9] Richard Berk et al. 2017.\\nA convex framework for fair regression.\\narXiv preprint\\narXiv:1706.02409 (2017).\\n[10] Dimitris Bertsimas, Vivek F. Farias, and Nikolaos Trichakis. 2012. On the efficiency-fairness\\ntrade-off. Management Science 58, 12 (2012), 2234–2250. https://doi.org/10.1287/mnsc.\\n1120.1549\\n[11] Miranda Bogen and Aaron Rieke. 2018. Help Wanted: An Examination of Hiring Algorithms,\\nEquity, and Bias. Report. Upturn.\\n[12] Anu Bradford. 2020. The Brussels effect: How the European Union rules the world. Oxford\\nUniversity Press, USA.\\n[13] Nadia Burkart and Marco F. Huber. 2021.\\nA survey on the explainability of supervised\\nmachine learning. Journal of Artificial Intelligence Research 70 (2021), 245–317.\\nhttps:\\n//doi.org/10.1613/jair.1.12228\\n17\\n[14] Aylin Caliskan, Joanna J Bryson, and Arvind Narayanan. 2017. Semantics derived automat-\\nically from language corpora contain human-like biases. Science 356, 6334 (2017), 183–186.\\nhttps://doi.org/10.1126/science.aal4230\\n[15] Longbing Cao. 2022.\\nAI in finance: Challenges, techniques, and opportunities.\\nComput.\\nSurveys 55, 3 (2022), 1–38. https://doi.org/10.1145/3502289\\n[16] Manuel Carabantes. 2020.\\nBlack-box artificial intelligence:\\nan epistemological and crit-\\nical analysis.\\nAI & SOCIETY\\n35, 2 (2020), 309–317.\\nhttps://doi.org/10.1007/\\ns00146-019-00888-w\\n[17] Daniel C Castro, Ian Walker, and Ben Glocker. 2020.\\nCausality matters in medical\\nimaging.\\nNature Communications 11, 1 (2020), 1–10.\\nhttps://doi.org/10.1038/\\ns41467-020-17478-w\\n[18] Kyla Chasalow and Karen Levy. 2021. Representativeness in Statistics, Politics, and Ma-\\nchine Learning. In Proceedings of the 2021 ACM Conference on Fairness, Accountability,\\nand Transparency. Association for Computing Machinery, New York, NY, USA, 77–89.\\nhttps://doi.org/10.1145/3442188.3445872\\n[19] Henriette Cramer, Jean Garcia-Gathright, Sravana Reddy, Aaron Springer, and Romain\\nTakeo Bouyer. 2019. Translation, tracks & data: an algorithmic bias effort in practice. In\\nExtended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems.\\n1–8. https://doi.org/10.1145/3290607.3299057\\n[20] Alexander D’Amour, Hansa Srinivasan, James Atwood, Pallavi Baljekar, David Sculley, and\\nYoni Halpern. 2020. Fairness is not static: deeper understanding of long term fairness via\\nsimulation studies. In Proceedings of the 2020 Conference on Fairness, Accountability, and\\nTransparency. 525–534. https://doi.org/10.1145/3351095.3372878\\n[21] Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, and Krishna P\\nGummadi. 2021. When the umpire is also a player: Bias in private label product recommenda-\\ntions on e-commerce marketplaces. In Proceedings of the 2021 ACM Conference on Fairness,\\nAccountability, and Transparency. 873–884. https://doi.org/10.1145/3442188.3445944\\n[22] William Dieterich, Christina Mendoza, and MS Tim Brennan. 2016. COMPAS Risk Scales:\\nDemonstrating Accuracy Equity and Predictive Parity.\\n[23] Tom Dobber, Ronan Ó Fathaigh, and Frederik Zuiderveen Borgesius. 2019. 
The regulation of\\nonline political micro-targeting in Europe. Internet Policy Review 8, 4 (2019).\\n[24] Cory Doctorow. 2021. How to Destroy ‘Surveillance Capitalism’. Medium Editions.\\n[25] Paul Dourish and Genevieve Bell. 2011. Divining a Digital Future: Mess and Mythology in\\nUbiquitous Computing. MIT Press, Cambridge, Mass.\\n[26] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. 2012.\\nFairness through awareness. Proceedings of the 3rd Innovations in Theoretical Computer Sci-\\nence Conference on - ITCS ’12. https://doi.org/10.1145/2090236.2090255\\n[27] M. C. Elish and danah boyd. 2017.\\nSituating methods in the magic of Big Data and ai.\\nCommunication Monographs 85, 1 (2017), 57–80.\\nhttps://doi.org/10.1080/03637751.\\n2017.1375130\\n18\\n[28] Robert Epstein. 2019. Why Google Poses a Serious Threat to Democracy, and How to End\\nThat Threat. America Institute for Behavioral Research and Technology (2019).\\n[29] Jessie Finocchiaro, Roland Maio, Faidra Monachou, Gourab K Patro, Manish Raghavan,\\nAna-Andreea Stoica, and Stratis Tsirtsis. 2021. Bridging Machine Learning and mechanism\\ndesign towards Algorithmic Fairness. Proceedings of the 2021 ACM Conference on Fairness,\\nAccountability, and Transparency. https://doi.org/10.1145/3442188.3445912\\n[30] Ben Green and Yiling Chen. 2019. Disparate interactions: An algorithm-in-the-loop analysis of\\nfairness in risk assessments. In Proceedings of the 2019 Conference on Fairness, Accountability,\\nand Transparency. 90–99. https://doi.org/10.1145/3287560.3287563\\n[31] Graham Greenleaf. 2021. The ‘Brussels Effect’ of the EU’s ‘AI Act’ on Data Privacy Outside\\nEurope. , 3-7 pages. https://papers.ssrn.com/abstract=3898904\\n[32] Nina Grgić-Hlača, Elissa M Redmiles, Krishna P Gummadi, and Adrian Weller. 2018. Human\\nperceptions of fairness in algorithmic decision making: A case study of criminal risk prediction.\\nIn Proceedings of the 2018 World Wide Web Conference - WWW ’18. 903–912.\\nhttps:\\n//doi.org/10.1145/3178876.3186138\\n[33] Nina Grgić-Hlača, Muhammad Bilal Zafar, Krishna P Gummadi, and Adrian Weller. 2018.\\nBeyond distributive fairness in algorithmic decision making: Feature selection for procedurally\\nfair learning. In Thirty-Second AAAI Conference on Artificial Intelligence.\\nhttps://ojs.\\naaai.org/index.php/AAAI/article/view/11296\\n[34] Ruocheng Guo, Lu Cheng, Jundong Li, P Richard Hahn, and Huan Liu. 2020. A survey of\\nlearning causality with data: Problems and methods. ACM Computing Surveys (CSUR) 53,\\n4 (2020), 1–37. https://doi.org/10.1145/3397269\\n[35] Abhishek Gupta, Connor Wright, Marianna Bergamaschi Ganapini, Masa Sweidan, and Ren-\\njie Butalid. 2022.\\nState of AI Ethics Report (Volume 6, February 2022).\\narXiv preprint\\narXiv:2202.07435 (2022).\\n[36] Alisha Haridasani Gupta. 2019. Are Algorithms Sexist? The New York Times (2019).\\n[37] Meeri Haataja and Joanna J. Bryson. 2021. What costs should we expect from the EU’s AI\\nAct? SocArXiv. Center for Open Science.\\n[38] Philipp Hacker. 2018. Teaching fairness to artificial intelligence: Existing and novel strategies\\nagainst algorithmic discrimination under EU law. Common Market Law Review 55, 4 (2018),\\n1143–1185. https://doi.org/10.54648/cola2018095\\n[39] Bernard E. Harcourt. 2007.\\nAgainst Prediction: Profiling, Policing, and Punishing in an\\nActuarial Age. University of Chicago Press. viii, 336 pages.\\n[40] Moritz Hardt, Eric Price, and Nati Srebro. 2016. 
Equality of opportunity in supervised learn-\\ning. In Advances in Neural Information Processing Systems, Vol. 29. 3315–3323.\\nhttps://\\nproceedings.neurips.cc/paper/2016/file/9d2682367c3935defcb1f9e247a97c0d-Paper.\\npdf\\n[41] John C. Havens. 2018. Creating the human standard for ethical autonomous and intelligent\\nsystems (A/IS). AI Matters 4 (4 2018), 28–31. Issue 1. https://doi.org/10.1145/3203247.\\n3203255\\n19\\n[42] Thomas T Hills. 2019. The Dark Side of Information Proliferation. Perspectives on Psycho-\\nlogical Science 14 (2019), 323–330. https://doi.org/10.1177/1745691618803647\\n[43] Anna Lauren Hoffmann. 2019. Where fairness fails: data, algorithms, and the limits of an-\\ntidiscrimination discourse. 22, 7 (06 2019), 900–915.\\nhttps://doi.org/10.1080/1369118x.\\n2019.1573912\\n[44] Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé III, Miro Dudik, and Hanna Wal-\\nlach. 2019. Improving fairness in machine learning systems: What do industry practitioners\\nneed?. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems.\\n1–16. https://doi.org/10.1145/3290605.3300830\\n[45] Brad Hooker. 2014. Utilitarianism and fairness. Cambridge University Press, 280–302.\\n[46] Dietmar Hübner. 2021. Two kinds of discrimination in AI-based penal decision-making. ACM\\nSIGKDD Explorations Newsletter 23, 1 (2021), 4–13. https://doi.org/10.1145/3468507.\\n3468510\\n[47] Javier Camacho Ibáñez and Mónica Villas Olmeda. 2021. Operationalising AI ethics: How\\nare companies bridging the gap between practice and principles? An exploratory study. (08\\n2021).\\nhttps://doi.org/10.1007/s00146-021-01267-0\\n[48] Information Commissioner’s Office (ICO). 2021. Guide to the General Data Protection Reg-\\nulation (GDPR). https://ico.org.uk/for-organisations/guide-to-data-protection/\\nguide-to-the-general-data-protection-regulation-gdpr/\\n[49] Shelly Kagan. 2018. Normative ethics. Routledge.\\n[50] Mamoru Kaneko and Kenjiro Nakamura. 1979. The Nash Social Welfare function. Economet-\\nrica: Journal of the Econometric Society 47, 2 (1979), 423–435. https://doi.org/10.2307/\\n1914191\\n[51] Simon Keller. 2009. Welfarism. Philosophy Compass 4, 1 (2009), 82–95. https://doi.org/\\n10.1111/j.1747-9991.2008.00196.x\\n[52] Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan. 2017. Inherent Trade-Offs in\\nthe Fair Determination of Risk Scores. In 8th Innovations in Theoretical Computer Science\\nConference (ITCS 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 1–23.\\nhttps:\\n//doi.org/10.4230/LIPIcs.ITCS.2017.43\\n[53] Ansgar Koene, Adam Leon Smith, Takashi Egawa, Sukanya Mandalh, and Yohko Hatada.\\n2018.\\nIEEE P70xx, Establishing Standards for Ethical Technology.\\nProceedings of KDD,\\nExCeL London UK (8 2018), 1–2.\\n[54] P. M. Krafft, Meg Young, Michael Katell, Karen Huang, and Ghislain Bugingo. 2019. Defining\\nAI in Policy versus Practice.\\nhttps://papers.ssrn.com/abstract=3431304\\n[55] Matt J Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. 2017. Counterfactual fair-\\nness. In Advances in Neural Information Processing Systems, Vol. 30. 4066–4076.\\nhttps://\\nproceedings.neurips.cc/paper/2017/file/a486cd07e4ac3d270571622f4f316ec5-Paper.\\npdf\\n20\\n[56] Anja Lambrecht and Catherine Tucker. 2019. Algorithmic bias? an empirical study of appar-\\nent gender-based discrimination in the display of STEM career ads. Management Science 65,\\n7 (2019), 2966–2981. https://doi.org/10.1287/mnsc.2018.3093\\n[57] Moritz Laurer, Andrea Renda, and Timothy Yeung. 
2021. Clarifying the costs for the EU’s\\nAI Act. Technical Report.\\n[58] Richard Layard and Stephen Gllaister. 1994. Cost-benefit analysis. Cambridge University\\nPress, Cambridge, UK.\\n[59] Michelle Seng Ah Lee and Jatinder Singh. 2021. The landscape and gaps in open source\\nfairness toolkits. Proceedings of the 2021 CHI Conference on Human Factors in Computing\\nSystems. https://doi.org/10.1145/3411764.3445261\\n[60] Mark MacCarthy and Kenneth Propp. 2021. Machines learn that Brussels writes the rules:\\nThe EU’s new AI regulation. Brookings, May 4 (2021), 2021.\\n[61] Michael A. Madaio, Luke Stark, Jennifer Wortman Vaughan, and Hanna Wallach. 2020.\\nCo-Designing Checklists to Understand Organizational Challenges and Opportunities around\\nFairness in AI. In Proceedings of the 2020 CHI Conference on Human Factors in Computing\\nSystems. 1–14. https://doi.org/10.1145/3313831.3376445\\n[62] Karima Makhlouf, Sami Zhioua, and Catuscia Palamidessi. 2021.\\nOn the applicability of\\nmachine learning fairness notions. ACM SIGKDD Explorations Newsletter 23, 1 (2021), 14–\\n23. https://doi.org/10.1145/3468507.3468511\\n[63] Karl Manheim and Lyric Kaplan. 2019. Artificial intelligence: Risks to privacy and democracy.\\nYale JL & Tech. 21 (2019), 106. https://ssrn.com/abstract=3273016\\n[64] Ian Manners. 2002. Normative Power Europe: A Contradiction in Terms? 40, 2 (06 2002),\\n235–258.\\n[65] Mark McFadden, Kate Jones, Emily Taylor, and Georgia Osborn. 2021. Harmonising Artificial\\nIntelligence: The Role of Standards in the EU AI Regulation. (2021).\\n[66] Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan.\\n2021. A Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 54, 6, Article\\n115 (jul 2021), 35 pages.\\nhttps://doi-org.ezbusc.usc.gal/10.1145/3457607\\n[67] Milagros Miceli, Martin Schuessler, and Tianling Yang. 2020. Between Subjectivity and Im-\\nposition: Power Dynamics in Data Annotation for Computer Vision. Proceedings of the ACM\\non Human-Computer Interaction 4, CSCW2 (Oct. 2020), 1–25.\\nhttps://doi.org/10.1145/\\n3415186\\n[68] Shira Mitchell, Eric Potash, Solon Barocas, Alexander D’Amour, and Kristian Lum.\\n2021.\\nPrediction-Based Decisions and Fairness:\\nA Catalogue of Choices,\\nAssump-\\ntions,\\nand\\nDefinitions.\\n8,\\n1\\n(03\\n2021),\\n141–163.\\nhttps://doi.org/10.1146/\\nannurev-statistics-042720-125902 arXiv:1811.07867\\n[69] Brent Mittelstadt. 2019. Principles alone cannot guarantee ethical AI. Nature Machine Intel-\\nligence 1, 11 (11 2019), 501–507.\\nhttps://doi.org/10.1038/s42256-019-0114-4\\n21\\n[70] Andrew Moore. 2013.\\nHedonism.\\nStanford University.\\nhttps://plato.stanford.edu/\\nentries/hedonism/\\n[71] Jessica Morley, Anat Elhalal, Francesca Garcia, Libby Kinsey, Jakob Mökander, and Luciano\\nFloridi. 2021. Ethics as a Service: A Pragmatic Operationalisation of AI Ethics. 31, 2 (2021),\\n239–256.\\nhttps://doi.org/10.1007/s11023-021-09563-w\\n[72] Benjamin Mueller. 2021. How Much Will the Artificial Intelligence Act Cost Europe? Tech-\\nnical Report. Center for Data Innovation.\\n[73] Sendhil Mullainathan. 2018. Algorithmic fairness and the social welfare function. In Pro-\\nceedings of the 2018 ACM Conference on Economics and Computation. 1–1.\\nhttps:\\n//doi.org/10.1145/3219166.3219236\\n[74] Catelijne Muller. 2020. The Impact of Artificial Intelligence on Human Rights, Democracy\\nand the Rule of Law. Technical Report. Council of Europe, Strasbourg.\\n[75] Helen Nissenbaum. 2009. 
Privacy in Context: Technology, Policy, and the Integrity of Social\\nLife. Stanford University Press.\\nhttps://doi.org/10.1515/9780804772891\\n[76] Eirini Ntoutsi et al. 2020. Bias in data-driven artificial intelligence systems—An introductory\\nsurvey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 10 (5 2020).\\nIssue 3. https://doi.org/10.1002/widm.1356\\n[77] Council of the European Union. 2000. Council Directive 2000/78/EC of 27 November 2000\\nestablishing a general framework for equal treatment in employment and occupation.\\n[78] Council of the European Union. 2000.\\nCouncil Directive E 2000/43/EC of 29 June 2000\\nimplementing the principle of equal treatment between persons irrespective of racial or ethnic\\norigin.\\n[79] Council of the European Union. 2004. Council Directive 2004/113/EC of 13 December 2004\\nimplementing the principle of equal treatment between men and women in the access to and\\nsupply of goods and services.\\n[80] Commission of the European Union. 2021.\\nCommunication from the Commission to the\\nEuropean Parliament, the Council, the European Economic and Social Committee and the\\nCommittee of the Regions 2013 Digital Compass: the European way for the Digital Decade.\\n[81] Will Orr and Jenny L Davis. 2020.\\nAttributions of ethical responsibility by Artificial In-\\ntelligence practitioners.\\nInformation, Communication & Society 23, 5 (2020), 719–735.\\nhttps://doi.org/10.1080/1369118x.2020.1713842\\n[82] European Parliament and Council. 2007. Charter of Fundamental Rights of the European\\nUnion.\\n[83] European Parliament and Council of the European Union. 2006. Directive 2006/54/EC Of the\\nEuropean Parliament and of the Council of 5 July 2006 on the implementation of the principle\\nof equal opportunities and equal treatment of men and women in matters of employment and\\noccupation.\\n22\\n[84] European Parliament and Council of the European Union. 2016. Regulation (EU) 2016/679\\non the protection of natural persons with regard to the processing of personal data and on\\nthe free movement of such data, and repealing Directive 95/46/EC (General Data Protection\\nRegulation).\\n[85] European Parliament and Council of the European Union. 2021. Proposal for a Regulation\\nof the European Parliament and of the Council laying down harmonised rules on Artificial\\nIntelligence (Artificial Intelligence Act) and amending certain Union legislative acts.\\n[86] Herbert James Paton. 1971. The categorical imperative: A study in Kant’s moral philosophy.\\nVol. 1023. University of Pennsylvania Press.\\n[87] Gourab K Patro, Arpita Biswas, Niloy Ganguly, Krishna P Gummadi, and Abhijnan\\nChakraborty. 2020. FairRec: Two-sided fairness for personalized recommendations in two-\\nsided platforms. In Proceedings of The Web Conference 2020. 1194–1204. https://doi.org/\\n10.1145/3366423.3380196\\n[88] Martin Pawelczyk, Sascha Bielawski, Johannes van den Heuvel, Tobias Richter, and Gjergji\\nKasneci. 2021. CARLA: A Python Library to Benchmark Algorithmic Recourse and Counter-\\nfactual Explanation Algorithms. In Proceedings of the Neural Information Processing Systems\\nTrack on Datasets and Benchmarks 2021.\\n[89] Martin Pawelczyk, Klaus Broelemann, and Gjergji Kasneci. 2020. On Counterfactual Ex-\\nplanations under Predictive Multiplicity. In Proceedings of the Thirty-Sixth Conference on\\nUncertainty in Artificial Intelligence, UAI 2020 (Proceedings of Machine Learning Research,\\nVol. 124). 
AUAI Press, 809–818.\\n[90] Seeta Peña Gangadharan and Jędrzej Niklas. 2019. Decentering Technology in Discourse on\\nDiscrimination. Information, Communication & Society 22, 7 (June 2019), 882–899.\\nhttps:\\n//doi.org/10.1080/1369118X.2019.1593484\\n[91] Inioluwa Deborah Raji et al. 2020. Closing the AI accountability gap: Defining an end-to-end\\nframework for internal algorithmic auditing. In Proceedings of the 2020 conference on fairness,\\naccountability, and transparency. 33–44. https://doi.org/10.1145/3351095.3372873\\n[92] Bogdana Rakova, Jingying Yang, Henriette Cramer, and Rumman Chowdhury. 2021. Where\\nresponsible AI meets reality: Practitioner perspectives on enablers for shifting organizational\\npractices. Proceedings of the ACM on Human-Computer Interaction 5, CSCW1 (2021), 1–23.\\nhttps://doi.org/10.1145/3449081\\n[93] John Rawls. 2009. A theory of justice. Harvard University Press, Cambridge, Mass.\\n[94] Richard M. Re and Alicia Solow-Niederman. 2019. Developing artificially intelligent justice.\\nStan. Tech. L. Rev. 22 (2019), 242. https://ssrn.com/abstract=3390854\\n[95] Andrea Renda et al. 2021. Study to support an impact assessment of regulatory requirements\\nfor artificial intelligence in Europe. Technical Report. European Commission - Directorate-\\nGeneral for Communications Networks, Content and Technology.\\n[96] Filipe N. Ribeiro et al. 2018. Media Bias Monitor : Quantifying Biases of Social Media News\\nOutlets at Large-Scale. In Twelfth International AAAI Conference on Web and Social Media.\\nAAAI Press, Palo Alto, California, 290–299.\\nhttps://aaai.org/ocs/index.php/ICWSM/\\nICWSM18/paper/view/17878\\n23\\n[97] Brianna Richardson and Juan E. Gilbert. 2021. A Framework for Fairness: A Systematic\\nReview of Existing Fair AI Solutions. (12 2021). arXiv:2112.05700 http://arxiv.org/abs/\\n2112.05700\\n[98] Boris Ruf and Marcin Detyniecki. 2021. Towards the Right Kind of Fairness in AI. (09 2021).\\narXiv:2102.08453 [cs] http://arxiv.org/abs/2102.08453\\n[99] Ajay Sandhu and Peter Fussey. 2021. The ‘uberization of policing’? How police negotiate\\nand operationalise predictive policing technology. Policing and Society 31, 1 (2021), 66–81.\\nhttps://doi.org/10.1080/10439463.2020.1803315\\n[100] Cristian Santesteban and Shayne Longpre. 2020. How big data confers market power to Big\\nTech: Leveraging the perspective of data science. The Antitrust Bulletin 65, 3 (2020), 459–485.\\nhttps://doi.org/10.1177/0003603x20934212\\n[101] Laura Sartori and Andreas Theodorou. 2022. A sociotechnical perspective for the future of AI:\\nnarratives, inequalities, and human control. Ethics and Information Technology 24, 1 (2022),\\n1–11. https://doi.org/10.1007/s10676-022-09624-3\\n[102] Daniel S. Schiff, Kelly Laas, Justin B. Biddle, and Jason Borenstein. 2022. Global AI Ethics\\nDocuments: What They Reveal About Motivations, Practices, and Policies. In Codes of Ethics\\nand Ethical Guidelines: Emerging Technologies, Changing Fields. Springer International Pub-\\nlishing, 121–143.\\nhttps://doi.org/10.1007/978-3-030-86201-5_7\\n[103] Andrew D. Selbst, danah boyd, Sorelle A. Friedler, Suresh Venkatasubramanian, and Janet\\nVertesi. 2019.\\nFairness and Abstraction in Sociotechnical Systems. In Proceedings of the\\nConference on Fairness, Accountability, and Transparency (Atlanta, GA, USA) (FAT* ’19).\\nAssociation for Computing Machinery, New York, NY, USA, 59–68.\\nhttps://doi.org/10.\\n1145/3287560.3287598\\n[104] Amartya Sen. 1979. Utilitarianism and welfarism. 
The journal of Philosophy 76, 9 (1979),\\n463–489. https://doi.org/10.2307/2025934\\n[105] Josh\\nSimons\\nand\\nDipayan\\nGhosh.\\n2022.\\nUtilities\\nfor\\ndemoc-\\nracy:\\nWhy\\nand\\nhow\\nthe\\nAlgorithmic\\nInfrastructure\\nof\\nFacebook\\nand\\nGoogle\\nmust\\nbe\\nregulated.\\nhttps://www.brookings.edu/research/\\nutilities-for-democracy-why-and-how-the-algorithmic-infrastructure-of-facebook-and-google-\\n[106] Taylor Telford. 2019.\\nApple Card algorithm sparks gender bias allegations against\\nGoldman\\nSachs.\\nhttps://www.washingtonpost.com/business/2019/11/11/\\napple-card-algorithm-sparks-gender-bias-allegations-against-goldman-sachs/\\n[107] Ehsan Toreini et al. 2020. The relationship between trust in AI and trustworthy machine\\nlearning technologies. In Proceedings of the 2020 conference on fairness, accountability, and\\ntransparency. 272–283. https://doi.org/10.1145/3351095.3372834\\n[108] Antje Von Ungern-Sternberg et al. 2022.\\nDiscriminatory AI and the Law– Legal stan-\\ndards for algorithmic profiling. In The Cambridge Handbook of Responsible Artificial Intel-\\nligence: Interdisciplinary Perspectives (Cambridge Law Handbooks). Cambridge University\\nPress. https://ssrn.com/abstract=3876657\\n24\\n[109] Michael Veale and Frederik Zuiderveen Borgesius. 2021. Demystifying the Draft EU Artificial\\nIntelligence Act — Analysing the Good, the Bad, and the Unclear Elements of the Proposed\\nApproach. Computer Law Review International 22, 4 (Aug. 2021), 97–112.\\nhttps://doi.\\norg/doi:10.9785/cri-2021-220402\\n[110] Michael Veale, Max Van Kleek, and Reuben Binns. 2018. Fairness and accountability design\\nneeds for algorithmic support in high-stakes public sector decision-making. In Proceedings of\\nthe 2018 CHI Conference on Human Factors in Computing Systems. 1–14.\\nhttps://doi.\\norg/10.1145/3173574.3174014\\n[111] Sahil Verma and Julia Rubin. 2018. Fairness Definitions Explained. In Proceedings of the Inter-\\nnational Workshop on Software Fairness (New York, NY, USA) (FairWare ’18). Association\\nfor Computing Machinery, 1–7.\\nhttps://doi.org/10.1145/3194770.3194776\\n[112] Julius von Kügelgen, Amir-Hossein Karimi, Umang Bhatt, Isabel Valera, Adrian Weller,\\nand Bernhard Schölkopf. 2021.\\nOn the Fairness of Causal Algorithmic Recourse.\\narXiv:2010.06529 [cs, stat] http://arxiv.org/abs/2010.06529\\n[113] Raphaële Xenidis and Linda Senden. 2019. EU non-discrimination law in the era of artificial\\nintelligence: Mapping the challenges of algorithmic discrimination. In Ulf Bernitz et al (eds),\\nGeneral Principles of EU law and the EU Digital Order. Kluwer Law International, 2020,\\n151–182. https://ssrn.com/abstract=3529524\\n[114] Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P Gummadi.\\n2017. Fairness beyond disparate treatment & disparate impact: Learning classification without\\ndisparate mistreatment. In Proceedings of the 26th international conference on world wide web.\\n1171–1180. https://doi.org/10.1145/3038912.3052660\\n[115] Muhammad Bilal Zafar, Isabel Valera, Manuel Rodriguez, Krishna Gummadi, and Adrian\\nWeller. 2017. From parity to preference-based notions of fairness in classification. In Advances\\nin Neural Information Processing Systems. 229–239.\\n[116] Aleš Završnik. 2020. Criminal justice, artificial intelligence systems, and human rights. In\\nERA Forum, Vol. 20. Springer, 567–583. https://doi.org/10.1007/s12027-020-00602-0\\n[117] Shoshana Zuboff. 2019. 
Surveillance Capitalism and the Challenge of Collective Action. New\\nLabor Forum 28, 1 (Jan. 2019), 10–29.\\nhttps://doi.org/10.1177/1095796018819461\\n[118] Frederik Zuiderveen Borgesius. 2018. Discrimination, artificial intelligence, and algorithmic\\ndecision-making. Technical Report. Strasbourg Council of Europe.\\n25\\n\\n\\n\\n---\\n\\n\\nConnecting the Dots in Trustworthy Artificial Intelligence: From AI\\nPrinciples, Ethics, and Key Requirements to Responsible AI Systems\\nand Regulation\\nNatalia Díaz-Rodrígueza,∗, Javier Del Serb,c,∗, Mark Coeckelberghd, Marcos López de Pradoe,f,g,\\nEnrique Herrera-Viedmaa and Francisco Herreraa\\naDepartment of Computer Science and Artificial Intelligence, DaSCI Andalusian Institute in Data Science and Computational Intelligence, University of\\nGranada, Granada 18071, Spain\\nbTECNALIA, Basque Research and Technology Alliance (BRTA), 48160 Derio, Spain\\ncDepartment of Communications Engineering, University of the Basque Country (UPV/EHU), 48013 Bilbao, Spain\\ndDepartment of Philosophy, University of Vienna, Vienna, 1010, Austria\\neSchool of Engineering, Cornell University, Ithaca, NY, 14850, United States\\nfADIA Lab, Al Maryah Island, Abu Dhabi, United Arab Emirates\\ngDepartment of Mathematics, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates\\nA R T I C L E I N F O\\nKeywords:\\nTrustworthy AI\\nAI Ethics\\nResponsible AI systems\\nAI Regulation\\nRegulatory Sandbox\\nA B S T R A C T\\nTrustworthy Artificial Intelligence (AI) is based on seven technical requirements sustained over three\\nmain pillars that should be met throughout the system’s entire life cycle: it should be (1) lawful,\\n(2) ethical, and (3) robust, both from a technical and a social perspective. However, attaining truly\\ntrustworthy AI concerns a wider vision that comprises the trustworthiness of all processes and actors\\nthat are part of the system’s life cycle, and considers previous aspects from different lenses. A more\\nholistic vision contemplates four essential axes: the global principles for ethical use and development\\nof AI-based systems, a philosophical take on AI ethics, a risk-based approach to AI regulation, and the\\nmentioned pillars and requirements. The seven requirements (human agency and oversight; robustness\\nand safety; privacy and data governance; transparency; diversity, non-discrimination and fairness;\\nsocietal and environmental wellbeing; and accountability) are analyzed from a triple perspective:\\nWhat each requirement for trustworthy AI is, Why it is needed, and How each requirement can be\\nimplemented in practice. On the other hand, a practical approach to implement trustworthy AI systems\\nallows defining the concept of responsibility of AI-based systems facing the law, through a given\\nauditing process. Therefore, a responsible AI system is the resulting notion we introduce in this work,\\nand a concept of utmost necessity that can be realized through auditing processes, subject to the\\nchallenges posed by the use of regulatory sandboxes. Our multidisciplinary vision of trustworthy AI\\nculminates in a debate on the diverging views published lately about the future of AI. Our reflections\\nin this matter conclude that regulation is a key for reaching a consensus among these views, and that\\ntrustworthy and responsible AI systems will be crucial for the present and future of our society.\\n1. Introduction\\nWe are witnessing an unprecedented upsurge of Artifi-\\ncial Intelligence (AI) systems. 
Despite its important\nhistorical development, in the last years AI has vigorously entered\nall professional and social domains of applications, from\nautomation to healthcare, education and beyond. Recently, a\nfamily of generative AI systems (DALL-E1 [1], Imagen2 [2] or large\nlanguage model products such as ChatGPT3) has sparked\na significant amount of debate. These debates arise from concerns about\nwhat such systems could mean in all fields of application and what\nimpact they could have.\nThe views expressed in this article are the authors’, and are not\nnecessarily the views of the institutions they are affiliated with.\n∗These authors contributed equally. Corresponding authors:\nnataliadiaz@ugr.es (N. Díaz-Rodríguez), javier.delser@tecnalia.com (J. Del Ser).\n1DALL-E, https://openai.com/product/dall-e-2, accessed on April\n25th, 2023.\n2Google Imagen, https://imagen.research.google/, accessed on April\n25th, 2023.\n3Chat Generative Pre-Trained Transformer (Chat GPT), https://chat.openai.com/chat, accessed on April 25th, 2023.\nThese debates, active for years now, pose questions\nregarding the ethical aspects and requirements that AI systems\nmust comply with. They emerge from the ethical principles’\nperspective, from the regulation ones, from what it means to\nhave fair AI, or from the technological point of view, on what\nan ethical development and use of AI systems really mean.\nThe notion of trustworthy AI has attracted particular interest\nacross the political institutions of the European Union (EU).\nThe EU has intensively worked on elaborating this concept\nthrough a set of guidelines based on ethical principles and\nrequirements for trustworthy AI [3].\nTrustworthy AI is a holistic and systemic approach that\nacts as a prerequisite for people and societies to develop,\ndeploy and use AI systems [3]. It is composed of three pillars\nand seven requirements: the legal, ethical, and technical\nrobustness pillars; and the following requirements: human\nagency and oversight; technical robustness and safety; privacy and data governance; transparency; diversity, non-discrimination and fairness; societal and environmental\nwellbeing; and accountability.\nFigure 1: Our holistic approach to attain responsible AI systems from trustworthy AI breaks down trustworthy AI into 4 critical\naxes: assuring the principles for ethical development and use of AI, philosophically reflecting on AI ethics, complying with AI\nregulation, and fulfilling Trustworthy AI requirements.\nAlthough the previous definition is based on requirements, there is a larger multidimensional vision. 
It considers the ethical debate per se, the\nethical principles and a risk-based approach to regulation,\nbacked up by the EU AI Act [4].\nThe goal of this paper is to become a primer for\nresearchers and practitioners interested in a holistic vision of\ntrustworthy AI from 4 axes (Fig. 1): from ethical principles\nand AI ethics, to legislation and technical requirements.\nAccording to this vision, our analysis tackles the main aspects\nof trustworthy AI in a non-exhaustive but technical fashion,\nby:\n• Providing a holistic vision of the multifaceted notion of\ntrustworthy AI that considers its diverse principles for\nethical use and development of AI, seen from international\nagencies, governments and the industry.\n• Breaking down this multidimensional vision of\ntrustworthy AI into 4 axes, to reveal the intricacies associated with\nits pillars, its technical and legal requirements, and what\nresponsibility in this context really means.\n• Examining requirements for trustworthy AI, addressing\nwhat each requirement actually means, why it is necessary\nand proposed, and how it is being addressed\ntechnologically. While this paper is not intended to be an exhaustive\nreview, we will delve into an overview of technical\npossibilities to address the aforementioned seven key\nrequirements for trustworthy AI.\n• Analyzing AI regulation from a pragmatic perspective to\nunderstand the essentials of the most advanced legal piece\nexisting so far, the European Commission perspective,\nand to fully grasp its practical applicability.\n• Defining responsible AI systems as the result of\nconnecting the many-sided aspects of trustworthy AI above. This\nis the notion we advocate for, in order to truly attain\ntrustworthy AI. Their design should be guided by regulatory\nsandboxes.\n• Dissecting currently hot debates on the status of AI,\nthe moratorium letter to pause giant AI experiments, the\ncurrent movements around an international regulation and\nour positioning based on the previous analyses.\nBy bridging the gap from theory (AI Principles, Ethics,\nand Key Requirements) to practice (Responsible AI Systems\nand Regulation), our holistic view offered in this work aims\nto ultimately highlight the importance of all these elements\nin the development and integration of human-centered AI-based\nsystems into the everyday life of humans, in a natural\nand sustainable way.\nThe paper is organized as follows: Section 2 reviews the\nmost widely recognized AI principles for the ethical use\nand development of AI (axis 1). Section 3 considers axis\n2: a philosophical approach to AI ethics. Section 4 (axis 3)\npresents the current risk-based viewpoint to AI regulation.\nSection 5 analyzes axis 4, i.e., key requirements to\nimplement trustworthy AI from a technical point of view. Section 6\ndiscusses the practical applicability of trustworthy AI by first\nclosing the loop through the necessary definition of\nresponsible AI systems, and then exposing the requirements for\nhigh-risk AI systems to comply with the law. It also proposes\nthe use of regulatory sandboxes as a challenge to test the\nformer requirements via auditing, and a practical blueprint\nas a case study for AI healthcare. 
We end this section by\nelaborating on the needs of emerging AI systems (including\ngeneral purpose AI and neuroscience technology), which\ndemand evolved evaluation protocols and dynamic regulation.\nSection 7 follows by dissecting currently hot debates on the\nstatus of AI, from the AI moratorium letter to regulation\nas the key for consensus, including a reflection on the gap\nto be closed between regulation and technological progress.\nFinally, Section 8 draws concluding remarks, and highlights\nthe aforementioned convergence between AI technology and\nregulation as the beacon for research efforts that safely bring\nthe benefits of this technology to humanity.\n2. Principles for ethical use and development\nof Artificial Intelligence\nA large set of declarations and guidelines for the ethical\nuse and development of AI has bloomed. These declarations\nlead to different but similar approaches for introducing sets of\nprinciples as a departure point for discussing the\nresponsible development of AI.\nIn this section we will analyze three different principle\ndeclarations. We will start in Section 2.1 with the general\nUNESCO Recommendation on the Ethics of Artificial Intelligence, and continue in Section 2.2 taking a practical\nperspective from the industry point of view through the principles of Responsible AI by Design in Practice by Telefónica,\na global telecommunication and media company with a large\npresence in the Spanish and Portuguese markets. Finally,\nin Section 2.3 we analyze the ethical principles based on\nfundamental rights associated with the European approach.\n2.1. UNESCO recommendation on the ethics of\nArtificial Intelligence\nIn November 2021, UNESCO proposed in Paris a Recommendation on the Ethics of AI. Recognizing that AI can\nbe of great service to humanity and all countries can benefit\nfrom it, but can also raise fundamental ethical concerns (it can\ndeepen existing divides and inequities in the world), and\naccounting for the Universal Declaration of Human Rights\n(1948) and the rest of relevant international recommendations and declarations, the principles of the UNESCO Recommendation on\nthe Ethics of Artificial Intelligence [5] are the following:\n1. Proportionality and do no harm: AI methods should\nnot infringe upon the foundational values in these recommendations, should be based on rigorous scientific\nfoundations, and final human determination should apply.\n2. Safety and security: Unwanted harms such as safety risks,\nand vulnerabilities to attacks (security risks) should be\navoided and addressed throughout the life cycle of AI\nsystems.\n3. Fairness and non-discrimination: AI actors should promote social justice and safeguard fairness. Member States\nshould tackle digital divides, ensuring inclusive access\nand equity, and participation in the development of AI.\n4. Sustainability: The continuous assessment of the human,\nsocial, cultural, economic and environmental impact of\nAI technology should be carried out with “full cognizance of the implications of AI for sustainability” as\na set of constantly evolving goals.\n5. Right to Privacy and Data Protection: Privacy must be\nrespected, protected and promoted throughout the AI life\ncycle.\n6. 
Human oversight and determination: Member States\nshould ensure that it is always possible to attribute ethical\nand legal responsibility for any stage of the life cycle of\nAI systems, as well as in cases of remedy related to AI\nsystems, to physical persons or existing legal entities.\n7. Transparency and explainability: Transparency is necessary for relevant liability regimes to work effectively.\nAI actors should commit to ensuring that the algorithms\ndeveloped are explainable, especially in cases that impact the end user in a way that is not temporary, easily\nreversible or otherwise low risk.\n8. Responsibility and accountability: “The ethical responsibility and liability for the decisions and actions based\nin any way on an AI system should always ultimately be\nattributable to AI actors”.\n9. Awareness and literacy: Public awareness and understanding of AI technologies and the value of data should\nbe promoted through open and accessible education, civic\nengagement, digital skills and AI ethics training. All\nsociety should be able to take informed decisions about\ntheir use of AI systems and be protected from undue\ninfluence.\n10. Multi-stakeholder and adaptive governance and collaboration: “Participation of different stakeholders throughout the AI system life cycle is necessary for inclusive\napproaches to AI governance, enabling the benefit to be\nshared by all, and to contribute to sustainable development”.\nThe proposed principles are accompanied by values to\npromote, e.g., human rights and fundamental freedoms. Values and principles are designed to be respected by all actors\ninvolved in the AI system life cycle, being amenable to\nchange through amendments to existing and new legislation\nand business guidelines, since they must comply with international law, the United Nations Charter and Member States.\n2.2. Telefónica’s principles of Responsible AI by\nDesign in Practice\nEnterprises also need to cope with and adapt to new\ndemands of AI products and associated risks. The previous\nrecommendations are also aligned with the more generic\nprinciples for AI defined by the Berkman Klein Center\nfor Internet & Society at Harvard University that started\nbeing developed in 2016: Principled AI maps ethical and\nrights-based approaches to principles for AI to address issues related to the potential threats of AI to both individuals\nand society as a whole. Derived from these, in industry, e.g.,\nTelefónica defines the so-called 5 principles of Responsible\nAI by Design in Practice [6] as:\n1. Fair AI: the output of AI systems must not lead to\ndiscrimination.\n2. Transparent and explainable AI: people should know\nwhether they are communicating with a person or an AI-based\nsystem.\n3. Human-centered AI (AI for Social Good, Human-centered\nAI [7]): AI products and services must be aligned with the\nUN Sustainable Development Goals.\n4. Privacy and security by design: standards should be considered during all life cycles, also from the Responsible\nResearch and Innovation Guidelines [8].\n5. Extend them to any third party.\nThe adoption of these and similar principles is part of\nnew awareness strategies being carried out in companies,\nsometimes known as change management. 
Telefónica’s\napproach is only one example of such adoption. This implies\na change in organizations’ culture to take into account and\nimplement these principles on a day-to-day basis.\n2.3. Ethical principles based on fundamental\nrights\nIn Europe, the foundations of trustworthy AI adhere\nto the four ethical principles proposed by the European\nCommission’s High-Level Expert Group (HLEG) [3]. These\nare based on fundamental rights, to which AI practitioners\nshould always strive to adhere, in order to ensure the development, deployment and use of AI systems in a trustworthy\nway. Trustworthy AI is grounded in fundamental rights and\nreflected by the European Commission’s Ethical Principles:\n1. Respect for human autonomy: Ensuring freedom and autonomy of humans interacting with AI systems implies\nthat humans should keep full and effective self-determination\nover themselves and the ability to take part in democratic\nprocesses; AI systems should not \"unjustifiably subordinate, coerce, deceive, manipulate, condition or herd\nhumans, but rather, augment, complement and empower\nhuman cognitive, social and cultural skills, leave opportunity for human choice and securing human oversight\nover work processes\" in AI systems, e.g., support humans\nin the work environment and support the creation of\nmeaningful work.\n2. Prevention of harm4: AI systems should not “cause nor\nexacerbate harm or adversely affect humans”. AI systems should “protect human dignity, mental and physical\nintegrity, be technically robust and assure they are not\nopen to malicious use”. For instance, they should be\nsupervised so they do not exacerbate adverse impacts due\nto information asymmetries or unequal balance of power.\n4Harm can be individual or collective, can include intangible harm to\nsocial, cultural, political or natural environments and all living beings.\n3. Fairness: Fairness is closely related to the rights to Non-discrimination, Solidarity and Justice. Although there are\nmany different interpretations of fairness, the European\nCommission advocates for having both: a) a substantive\ndimension of fairness that \"commits to ensure equal and\njust distribution of benefits and costs, commits to free\nfrom unfair bias, discrimination and stigmatization, implies respecting the principle of proportionality between\nmeans and ends and a careful balancing of competing\ninterests and objectives\" [3]. b) a procedural dimension\nallowing to \"contest and seek redress against decisions\ntaken by AI systems or who operates them\". To achieve\nthis, the entity responsible for the decision must be identifiable, while decision-making processes should be explainable.\n4. Explainability: Being crucial for building and maintaining users’ trust in the AI system, explainability means\nthat processes need to be transparent, the capabilities\nand purpose of AI systems openly communicated, and\ndecisions, to the extent possible, explainable to those\ndirectly and indirectly affected. When a decision cannot\nbe duly contested (e.g., because explaining a particular\nmodel output or decision and what combination of input\nfactors contributed to it is not always possible), then other\nexplainability measures may be required (traceability,\nauditability and transparent communication on the\ncapabilities of the AI system). 
This will depend on the context\nand severity of consequences if an output is erroneous.\nThese ethical principles are placed in the context of AI\nsystems. They act as ethical imperatives, and advocate for\nAI systems to strive to improve individual and collective\nwellbeing.\nAs we can see, the mobilization has been worldwide:\nfrom the Montréal Declaration for a responsible development of AI (an initiative of the University of Montréal), to the\nEthics of AI recommendations led by international organisations such as UNESCO, as well as the adoption led by\nindustry. All sets of principles share terminology, common\ngrounds on human rights, and agree on the relevance of\npreserving human decisions and responsibilities, which are\nthe most prominent features of ethics of AI.\n3. A philosophical approach to Artificial\nIntelligence ethics\nEthics is an academic discipline which is a subfield of\nphilosophy and generally deals with questions such as “What\nis a good action?”, “What is the value of a human life?”,\n“What is justice?”, or “What is the good life?” [3].\nAligned with the European Commission ethics guidelines [3], our ethical vision of AI consists of five main actions\n[9]. These can help smooth the way to attain ethical AI.\nNext, we develop these, taking a philosophical approach to\nAI ethics:\n1. Use philosophy and science to examine and critically\ndiscuss assumptions around the role that AI and humans\nplay in these scenarios and discussions. For example, one\ncould critically discuss claims that are made about the\npossibility of Artificial General Intelligence or human-level AI. Large language models, for instance, may give\nthe impression that they have a human-like level of intelligence, but work very differently than the human brain\nand make many mistakes that humans would not make.\nThis also leads to the question regarding the differences\nbetween humans and machines, and is also linked to the\nquestion concerning the moral status of AI. For example,\nit has been claimed that a chatbot was sentient, while it\ndid not meet the criteria for sentience. That being said, it\nis not always clear what these criteria are. AI makes us\nrevisit philosophical questions concerning moral status.\n2. Observe attentively the nature of AI and which functions\nit is assigned to perform today within its diversity of\napplications. We should recognize the pervasiveness of\nAI. One reason why it is important to ask ethical questions about AI is that it is pervasive: it is used in many\napplications such as search, text generation, recommendations for commercial products, and so on. In the ethical\nanalysis, we need to pay attention to the details of each\napplication.\n3. Discuss the most concrete and pressing ethical and social\nproblems that AI presents in terms of how it is being\napplied today. AI raises a number of ethical questions\nsuch as privacy and data protection, safety, responsibility,\nand explainability. For example, a chatbot can encourage someone to take their life. Does this mean that the\napplication is unsafe? How can we deal with this risk?\nAnd if something happens, who is responsible? 
Typically,\\nthere are many people involved in technological action.\\nIt is also important to be answerable to those who are\\naffected by the technology [10], for example in the case\\nof a suicide5 the company may need to be answerable to\\nthe family of the victim. Furthermore, it is important that\\nwhen AI offers recommendations for decisions, it is clear\\non what basis these recommendations and decisions are\\ntaken. One problem is that this is usually not clear in the\\ncase of deep learning. In addition, there are societal im-\\nplications such as potential unemployment caused by the\\nautomation that is enabled by AI, and the environmental\\ncosts of AI and its infrastructures through energy use and\\ncarbon emissions linked to the use of the algorithms, the\\nstorage of data, and the production of hardware.\\n4. Investigate AI policies for the near future. There are\\nnow already many policy documents on AI, for example\\nthe Ethics Guidelines for Trustworthy AI produced by\\nthe European Commission’s High-Level Expert Group\\n5https://coeckelbergh.medium.com/chatbots-can-kill-d82fde5cf6ca\\non AI [3] and the Recommendation on the Ethics of\\nArtificial Intelligence [5]. These documents need to be\\ncritically investigated. For example, in the beginning, less\\nattention was given to environmental consequences of AI.\\nA more general problem is that principles and lists of\\nethical considerations are not sufficient; there is still a\\ngap between those principles and implementation in the\\ntechnology, in standards, and in legal regulation.\\n5. Ask ourselves whether the attention that concentrates\\nthe public discourse in AI is useful as we face other\\nproblems, and whether AI should be our unique focus of\\nattention. Given that we also face other global problems\\nsuch as climate change and poverty, it is important to\\nask the question regarding prioritization: Is AI the most\\nimportant problem we face? And if not - if, for instance,\\nwe insist on climate change being the main and most\\nurgent global problem - how does AI impact and perhaps\\ncontribute to this problem, and how can it perhaps help to\\nsolve it? Reflection on these challenges will be important\\nin the coming years.\\nOnce expressed the ethics of AI from the philosophical\\nperspective, the next section will analyze the regulation of\\nAI.\\n4. Artificial Intelligence regulation: A\\nrisk-based approach\\nIn the currently hot debate of AI, a fundamental aspect\\nis regulating AI for it to be righteous. The most advanced\\nregulation to date is the European Commission’s AI Act\\nproposal6 for the regulation of AI [4].\\nIn this section we are presenting AI regulation from two\\nangles; first in Section 4.1 from the perspective of risk of AI\\nsystems and then, in Section 4.2, we make a deeper analysis\\ninto high-risk AI systems.\\n4.1. A risk-based approach to regulate the use of\\nArtificial Intelligence systems\\nThe AI Act draft proposal for a Regulation of the Euro-\\npean Parliament and of the Council laying down harmonized\\nrules on AI [4] is the first attempt to enact a horizontal\\nAI regulation. The proposed legal framework focuses on\\nthe specific use of AI systems. 
The European Commission proposes to establish a technology-neutral definition of AI systems in EU legislation and defines a classification for AI systems with different requirements and obligations tailored to a “risk-based approach”, where the obligations for an AI system are proportionate to the level of risk that it poses. The rules of the AI Act specifically consider the risks created by AI applications by proposing a list of high-risk applications, setting clear requirements for AI systems for high-risk applications, defining specific obligations for AI users and providers of high-risk applications, proposing a conformity assessment before the AI system is put into service or placed on the market, proposing enforcement after it is placed on the market, and proposing a governance structure at European and national levels.
[Figure 2: AI Act criticality pyramid and risk-based approach regulatory system for the use of algorithmic systems, showing the four levels of risk with example applications at each level; SS stands for subsequent articles (figure extended from the EU Portal8 and inspired from [11] and [12]).]
The four levels of risk [4] outlined by the AI Act are the following (Figure 2):
• Minimal or No risk: The vast majority of AI systems currently used in the EU fall into this category. The proposal allows the free use of minimal-risk AI. Voluntarily, providers of those systems may choose to apply the requirements for trustworthy AI and adhere to voluntary codes of conduct (Art. 69 - Codes of Conduct)7. When a compliant AI system presents a risk, the relevant operator will be required to take measures to ensure the system no longer presents that risk, withdraw the system from the market, or recall it within a reasonable period commensurate with the nature of the risk (Art. 67 - Compliant AI systems which present a risk). For instance: AI-enabled video games or spam filters.
• Limited risk: Systems whose users should be aware that they are interacting with a machine so they can take an informed decision to continue or step back. These have to comply with specific information/transparency obligations; for instance, chatbots, and systems generating deepfakes or synthetic content.
6On April 27th, 2023, the Members of European Parliament (MEPs) reached a political agreement on the AI Act, https://www.euractiv.com/section/artificial-intelligence/news/meps-seal-the-deal-on-artificial-intelligence-act/, accessed on May 1st, 2023.
7Codes of conduct are encouraged by the Commission and the Member States to foster the voluntary application to AI systems other than high-risk AI systems (HRAIs) “on the basis of technical specification and solutions that are appropriate means of ensuring compliance with such requirements in light of the intended purpose of the systems” (Art.
69).\\n8Regulatory framework proposal on Artificial Intelligence, https:\\n//digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai,\\naccessed on April 25th, 2023.\\n• High-risk AI systems (HRAIs): Systems that can have a\\nsignificant impact on the life chances of a user (Art. 6);\\nthey create an adverse impact on people’s safety or their\\nfundamental rights9. Eight types of systems fall into this\\ncategory; these are subject to stringent obligations and\\nmust undergo conformity assessments before being put on\\nthe European market, e.g. systems for law enforcement or\\naccess to education. They will always be high-risk when\\nsubject to third-party conformity assessment under that\\nsectorial legislation.\\n• Unacceptable risk: AI systems considered a clear threat\\nto the safety, livelihoods and rights of people will be pro-\\nhibited in the EU market (Title II — Prohibited Artificial\\nIntelligence Practices, Art. 5). For example: Social scor-\\ning, facial recognition, dark-patterns and manipulative AI\\nsystems, e.g., voice assistance systems that encourage\\ndangerous behavior, or real time remote biometric iden-\\ntification systems in public spaces for law enforcement.\\nAs we can see, very differently to the Chinese, government-\\ncentric, approach, or the US industry-owned-data approach\\nto AI, the EU is taking a human-centric approach to regulate\\nthe use of AI. This risk scenario-based approach regulates\\nusages rather than models and technology themselves. This\\nis the position we defend.\\nSince the beginning of 2023, the European Parliament\\nhas been considering amendments to the law proposing how\\nto conduct fundamental rights impact assessments and other\\nobligations for users of HRAIs. Issues still to be finalized\\ninclude closing the list of HRAI scenarios above exposed,\\n9As protected by the EU Charter of Fundamental Rights (source: https:\\n//ec.europa.eu/commission/presscorner/detail/en/QANDA_21_1683).\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 6 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nprohibited practices, and details concerning the use of copy-\\nrighted content to train AI systems10 and the regulation of\\ngeneral purpose AI systems (GPAIS)11. The first consider-\\nation indicates the EU Parliament will force AI operators\\nto reveal whether they use protected content. The second\\nrequest emerges from the Privacy and Access Council of\\nCanada, who agrees that GPAIS carry serious risks and\\nharmful unintended consequences, and must not be exempt\\nunder the EU AI Act, or equivalent legislation elsewhere.\\nA recent definition of GPAIS can be found in [13]: “An AI\\nsystem that can accomplish a range of distinct valuable tasks,\\nincluding some for which it was not specifically trained”.\\nIt has also been referred to as foundation model [14, JRC\\nGlossary, pag. 32]), but really a GPAIS refers to a model\\nof different nature, beyond the generative AI or foundation\\nmodels that can be considered as specific cases of GPAI\\nsystems. A point of agreement among all definitions to date\\nis the capability of a GPAIS to accomplish tasks beyond\\nthose for which it was originally trained. This is one of the\\nmain reasons why GPAIS have become a pivotal topic of\\ndebate in what refers to AI regulation. Section 6.5 will delve\\nfurther into this.\\n4.2. 
High-risk Artificial Intelligence systems\\nThe European AI Act is predicted to become the global\\nstandard for AI regulation12 by unifying within a single\\nframework the concept of risk acceptability and the trust-\\nworthiness of AI systems by their users [15]. The risk-based\\napproach of the AI Act specifically categorizes as HRAIs\\nthe following eight kind of AI systems (AI Act, Annex III -\\nHigh-risk AI systems referred to in Art. 6(2) [4]):\\n1. Surveillance systems (e.g., biometric identification and\\nfacial recognition systems for law enforcement)\\n2. Systems intended for use as security components in the\\nmanagement and operation of critical digital infrastruc-\\ntures (road traffic and water, gas, heat and electricity\\nsupply).\\n3. Systems to determine access, admission or assignment\\nof people to educational institutions or programs or to\\nevaluate people (for the purpose of evaluating learning\\noutcomes, learning processes or educational programs).\\n4. Systems intended to be used for recruitment or selection\\nof personnel, screening or filtering of applications and\\nevaluation of candidates, or systems for making decisions\\n10Financial Times, European parliament prepares tough measures over\\nuse of AI, https://www.ft.com/content/addb5a77-9ad0-4fea-8ffb-8e2ae250a\\n95a?shareType=nongift, accessed on April 25th, 2023.\\n11Privacy and Access Council of Canada, Five considerations to guide\\nthe regulation of “General Purpose AI”, https://pacc-ccap.ca/five-consi\\nderations-to-guide-the-regulation-of-general-purpose-ai/, accessed on\\nApril 25th, 2023.\\n12The EU AI Act’s Risk-Based Approach: High-Risk Systems and What\\nThey Mean for Users, https://futurium.ec.europa.eu/en/european-ai-all\\niance/document/eu-ai-acts-risk-based-approach-high-risk-systems-and\\n-what-they-mean-users, accessed on April 25th, 2023.\\non promotion and termination of contractual relation-\\nships, assignment of tasks based on individual behavior\\nand the evaluation of performance and behavior.\\n5. Systems for assessing the eligibility for public benefits\\nor assistance, assessing creditworthiness or establish-\\ning credit scores. Systems for dispatching or prioritizing\\nemergency first response services (firefighters, medical\\nfirst aid, etc.).\\n6. Systems to assess the risk of a person committing crime\\nor recidivism, or the risk that he or she is a potential\\noffender.\\n• Systems intended for use as polygraphs or to detect\\nemotional state, or to assess the reliability of evidence\\nin the course of an investigation or prosecution of\\ncrime.\\n• Systems for predicting the occurrence or re-occurrence\\nof crimes based on profiles of people or assessment of\\npersonality traits and characteristics or past criminal\\nbehavior.\\n• Systems for profiling individuals by law enforcement\\nauthorities in the course of detecting, investigating or\\nprosecuting crimes.\\n7. Systems intended for use by competent public authorities\\n(such as polygraphs or to detect the emotional state of\\nindividuals):\\n• Risk assessment systems, including security risks, ir-\\nregular immigration or health risk posed by a person\\nseeking to enter a member state.\\n• Systems for the examination of applications for asy-\\nlum, visas and residence permits and claims associated\\nwith the eligibility of people applying for status.\\n8. 
Systems intended for the administration of justice and\\ndemocratic processes (intended to act on behalf of the\\nauthorities in the administration of justice for the inter-\\npretation of acts or law and the application of the law\\nto a particular set of facts, or evaluation of reliability of\\nevidence).\\nOne fact worth noting in the AI Act is its special em-\\nphasis on the importance of taking into account, when\\nclassifying AI systems, the result of the AI system in relation\\nwith the decision or action taken by a human, as well as the\\nimmediacy of its effect (AI Act Intro, (32) [4]).\\n5. Trustworthy Artificial Intelligence: Pillars\\nand Requirements\\nIn a technical sense, trustworthiness is the confidence of\\nwhether a system/model will act as intended when facing a\\ngiven problem [16]. This confidence generates trust in the\\nuser of the model (the audience), which can be supported\\nfrom multiple perspectives. For instance, trust can be fos-\\ntered when a system provides detailed explanations of its\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 7 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\ndecisions [17]. As Lipton puts it, a person can be more\\nconfident when using a model if he/she understands how it\\nworks and how it produces its decisions [18]. Likewise, trust\\ncan be bolstered if the user is offered guarantees that the\\nmodel can operate robustly under different circumstances,\\nthat it respects privacy, or that it does not get affected by\\nbiases present in the data from which it learns.\\nTrustworthiness is, therefore, a multifaceted requisite for\\npeople and societies to develop, deploy and use AI systems,\\nand a sine qua non condition for the realization of the\\npotentially vast social and economic benefits AI can bring\\n[3]. Moreover, trustworthy does not concern only the system\\nitself, but also other actors and processes that take their part\\nduring the AI life cycle. This requires a holistic and systemic\\nanalysis of the pillars and requirements that contribute to the\\ngeneration of trust in the user of an AI-based system.\\nThis section addresses this need by first dissecting the\\nthree pillars that set the basis for trustworthy AI – namely,\\nlawfulness, ethics and robustness (Subsection 5.1) – fol-\\nlowed by a thorough analysis of the seven requirements\\nproposed by the European Commission’s High-Level Ex-\\npert Group (HLEG): human agency and oversight (Sub-\\nsection 5.2); technical robustness and safety (Subsection\\n5.3); privacy and data governance (Subsection 5.4); Trans-\\nparency (Subsection 5.5); diversity, non-discrimination and\\nfairness (Subsection 5.6); societal and environmental well-\\nbeing (Subsection 5.7); and accountability (Subsection 5.8).\\nDefinitions (what does the requirement stand for?), motiva-\\ntions (why is the requirement relevant for trustworthiness?)\\nand a short glimpse at methodologies (how can the require-\\nment be met in AI-based systems?) will be given for each of\\nthese requirements in their respective sections.\\n5.1. The three pillars of trustworthy Artificial\\nIntelligence\\nIn general, a pillar can be understood as a fundamental\\ntruth of a given idea or concept, from which key require-\\nments to realize the idea can be formulated. Similarly to\\nconstruction engineering, pillars are essential for building\\nup the concept of trustworthy AI: each pillar is necessary\\nbut not sufficient on its own to achieve trustworthy AI. 
Key\\nrequirements can contribute to one or several pillars, just\\nlike construction elements such as concrete, formwork or\\ncantilevers are used to help pillars support the structure\\nof the building. These requirements must be continuously\\nensured throughout the entire life cycle of AI systems,\\nthrough methodologies that must not only be technical, but\\nalso involve human interaction.\\nAccording to the EU Ethical Guidelines for Trustworthy\\nAI [3], pillars of trustworthy AI systems are defined as three\\nbasic properties that such systems should possess:\\n• Pillar 1: Lawful. Trustworthy AI systems should comply\\nwith applicable laws and regulations, both horizontally\\n(i.e. the European General Data Protection Regulation)\\nand vertically (namely, domain-specific rules that are\\nimposed in certain high-risk application domains, such as\\nmedical or finance).\\n• Pillar 2: Ethical. Besides their compliance with the law,\\ntrustworthy AI systems should also adhere to ethical prin-\\nciples and values. The rapid technological development\\nof current AI-based system rises ethical questions that\\nare not always addressed synchronously by regulatory\\nefforts. The democratized usage of large language models\\nand misinformation using deepfakes are two avant-garde\\nexponents of the relevance of Ethics as one of the pillars\\nof trustworthy AI.\\n• Pillar 3: Robust. Trustworthy AI systems should guarantee\\nthat they will not cause any unintentional harm, working\\nin a safe and reliable fashion from both technical (perfor-\\nmance, confidence) and social (usage, context) perspec-\\ntives.\\nTrustworthy AI stands on these three pillars. Ideally, they\\nshould act in harmony and pushing in synergistic directions\\ntowards the realization of trustworthy AI. However, tensions\\nmay arise between them: for instance, what is legal is not\\nalways ethical. Conversely, ethical issues may require the\\nimposition of law amendments that become in conflict with\\nprevalent regulations. Trustworthy AI must guarantee ethi-\\ncal principles and values, obey the laws, and operate robustly\\nso as to attain its expected impact on the socioeconomic\\nenvironment in which it is applied.\\nThe above three pillars lie at the heart of the HLEG\\nguidelines [3], which establish the seven requirements for\\ntrustworthy AI. As depicted in Figure 3, each requirement\\nspans several components or dimensions in which the re-\\nquirement becomes of special relevance for the design and\\noperation of an AI-based system. Apart from recommending\\ntechnical and non technical methods, the guidelines also\\ninclude an Assessment List for Trustworthy Artificial Intel-\\nligence, ALTAI, for self-assessment of AI systems and for\\nmaking the 7 requirements operative [19].\\nThe next subsections elaborate in detail on these require-\\nments:\\n5.2. Requirement 1: Human agency and oversight\\nWHAT does it mean? AI systems should empower human\\nbeings, allowing them to make informed decisions and fos-\\ntering their fundamental rights. At the same time, proper\\noversight mechanisms need to be ensured, which can be\\nachieved through human-in-the-loop, human-on-the-loop,\\nand human-in-command approaches. In other words, AI-\\nbased systems must support human autonomy and decision\\nmaking.\\nWHY is it important for trustworthiness? This require-\\nment is necessary for autonomy and control. 
The unfair manipulation, deception, herding and conditioning of people through AI-based systems can be a threat to the individual autonomy, rights and freedom of their users. Therefore, trustworthy AI systems should provide the means for the user to supervise, evaluate and freely adopt or override a decision made by such systems, avoiding decisions that are automatically made without humans being involved in the process.
[Figure 3: Pillars and requirements of Trustworthy AI [3] — the three pillars (lawfulness, ethics, robustness) surrounding the seven requirements, each mapped to one of Subsections 5.2–5.8.]
HOW can this requirement be met in practice? Two dimensions underlie this first requirement, namely, human agency (Subsection 5.2.1) and human oversight (Subsection 5.2.2). We now analyze different methodological approaches that can be adopted in these two dimensions:
5.2.1. Human agency
Mechanisms for human oversight will depend on the area of application and potential risk. For the preservation of human rights, human-compatible [20] and human-centric approaches [21, 22, 14], AI for social good [23, 7] and human computation or interactive machine learning [24] are computing paradigms aligned with this requirement. However, more structured toolkits (along the lines of [25] or C-Suite [26]) will need to be materialized for a smooth domain-specific consideration of this requirement. In terms of technical tools to reach different audiences, language appears as the universal means of communication among humans and machines; thus, AI models using natural language processing and/or counterfactual and natural language explanations [27] will be relevant to help humans supervise and take the most appropriate decision based on the output of AI systems.
5.2.2. Human oversight
Different degrees of human involvement in the supervision of AI-based systems can be specified [3]:
• Human-in-the-loop (HITL), which refers to the ability of the supervisor to intervene in every decision cycle of the system being monitored [24].
• Human-on-the-loop (HOTL), which stands for human intervention during the design and monitoring cycles of the AI-based system.
• Human-in-command (HIC), namely, the capability of the supervisor to oversee the overall activity of the AI system, including its broader economic, societal, legal and ethical impacts, and to ensure that decisions produced by the AI system can be overridden by the human.
Depending on the application under consideration, mechanisms supporting one of the above levels of human oversight can be designed. Methods proposed so far are largely domain-specific, since user-algorithm interfaces vary depending on the capabilities and background of the supervisor and the design of the AI-based solution.
5.3. Requirement 2: Technical robustness and safety
WHAT does it mean?
This second requirement includes several functionalities, all aligned with the prevention of unintentional harm and the minimization of the consequences of intentional harm. These include the resilience of AI-based systems (to attacks and security threats), ensuring fallback plans (in case something goes wrong), general safety, and being accurate, reliable and reproducible. Robustness and safety refer to the need for AI systems to be secure, reliable and robust to errors and/or inconsistencies in all phases of the life cycle [28].
WHY is it important for trustworthiness? AI-based systems deployed in real-world scenarios can undergo changes in their operating environment that induce changes in their inputs (e.g. concept drift). Likewise, such changes can be the result of the interaction of malicious users with the AI-based system in an adversarial fashion. Regardless of whether such changes are intentional or not, the trustworthiness of the AI-based system is subject to the capability of the model to mitigate the impact of these changes on its predictions. Likewise, in risk-critical applications trustworthy AI systems should evaluate relevant safety measures and be endowed with functionalities to fall back on when the AI-based system deviates from its expected behavior as per the monitored measures. Finally, reliability and reproducibility connect tightly with trustworthiness in what refers to the verification of the expected operation and performance of AI-based systems. When AI-based systems are to be used in different contexts and deployed in different systems, such components are vital to ensure that the system at hand resiliently accommodates the differences and particularities that may arise in each context/system, ultimately working as expected.
HOW can this requirement be met in practice? Methodologies that can be explored to support this requirement can be analyzed over three dimensions: technical robustness (Subsection 5.3.1), safety (Subsection 5.3.2) and reproducibility (Subsection 5.3.3).
5.3.1. Technical robustness
When dealing with an AI-based system, robustness and reliability are properties that refer to the ability of the system to have comparable performance on atypical data with respect to typical operational regimes [29]. Robustness can be required in the face of different circumstances: we expect a model to be robust because it may degrade, be perturbed or be affected during its future usage. It is desirable to have a model that is robust in terms of its generalization or generative capabilities, against adversarial attacks or models, and against data perturbations.
Systematically, several levels of robustness can be distinguished in AI-based systems [30]:
• Level 0 (no robustness or standard training): this first level of robustness refers to the one provided by the AI-based system by itself, without any risk mitigation functionalities or additions added to its design. This level concerns generalization capabilities such as being robust to distributional changes caused by spurious features or data instances. Despite the lack of specific risk mitigation measures, some information natively provided by certain AI models (e.g.
quantification of epistemic con-\\nfidence) can be exploited to detect when the AI-based\\nsystem is not working in its expected operational regime.\\n• Level 1 (generalization under distribution shifts): this\\nsecond level of robustness considers techniques aimed to\\nmitigate different types of changes in data. Data changes\\nor shifts include covariate shift, prior probability shift,\\nconcept drift and confounding shift, depending on the\\nchange happening in the distribution of the input vari-\\nables, the output of the model, the statistical relationship\\nbetween the inputs and outputs, or the change of a variable\\nthat influences both inputs and outputs, respectively [31].\\nIn this level we can also place the generalized frame-\\nwork of Out-of-Distribution (OoD) detection [32], which\\nrefers to anomaly detection, novelty detection and open\\nset recognition, the latter referring to the capability of\\nthe model to detect, characterize and incorporate new\\nunknown patterns to its knowledge base (e.g. new classes\\nin a classification problem). Level 1 of robustness against\\nthese data shifts can be approached by concept drift detec-\\ntion and adaptation techniques, OoD detection methods or\\nclass-incremental learning schemes, to mention a few.\\n• Level 2 (robustness against a single risk): this third worst-\\ncase robustness tackles a single point of risk, e.g., the\\npresence of adversarial examples. Assessing this level\\nrequires model inspection and intervention (e.g., active\\nmodel scanning, probing to find failure cases, adversarial\\ndefenses against different attack modes).\\n• Level 3 (robustness against multiple risks): It extends the\\nformer to multiple risks (e.g., common data corruptions,\\nspurious correlations).\\n• Level 4 (universal robustness): this level is reached by AI-\\nbased systems that are proven to be effectively robust to\\nall known risks.\\n• Level 5 (human-aligned and augmented robustness): it\\nfurthers complements level 4 by aligning human-centered\\nvalues and user feedback, automatically augmenting exist-\\ning robustness demands as per the requirements, context\\nand usage of the AI-based system. This level should be\\ntargeted by high-risk AI-powered applications.\\nThe robustness of the AI-system system should be a\\ncore part of the risk management strategy adopted by the\\nowner of the system itself, hence becoming a core part\\nof their accountability procedures. Indeed, AI maintenance\\nframeworks should ease achieving robustness and AI status\\ntracking and control through the AI life cycle [33]. Mon-\\nitoring can be produced either passively (by periodically\\nmeasuring different quantitative metrics related to robust-\\nness over the data, model, or both) or actively (emulating\\nthe circumstances under which the robustness of the model\\ncan be thought to be compromised (e.g. emulated adversarial\\nattack instances or perturbations of known samples). 
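To make the passive monitoring route more concrete, the following minimal sketch flags covariate shift in a single input feature by comparing a reference window (collected when the model was validated) against a recent live window with a two-sample Kolmogorov–Smirnov test. The function and variable names are illustrative assumptions rather than code from the cited works, and a production maintenance framework would track many features and model-quality metrics rather than one.

```python
# Minimal sketch (illustrative names, not from the cited works): flag covariate shift on one
# feature by comparing a validation-time reference window against a recent production window.
import numpy as np
from scipy.stats import ks_2samp

def covariate_shift_alarm(reference_window: np.ndarray,
                          live_window: np.ndarray,
                          alpha: float = 0.01) -> bool:
    """Return True when the two samples are unlikely to come from the same distribution."""
    _statistic, p_value = ks_2samp(reference_window, live_window)
    return p_value < alpha

# Toy usage: a mean shift in the live data emulates drift and should trigger the alarm.
rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)
live = rng.normal(loc=0.7, scale=1.0, size=1_000)
print(covariate_shift_alarm(reference, live))   # expected output: True
```

Under such a scheme, the alarm rate itself becomes one of the quantitative metrics that a maintenance framework can log and inspect over time.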
In\\nboth cases, AI maintenance frameworks can detect model\\ndegradation through time by detecting systematic deviations\\nof the aforementioned metrics in data and models [34].\\nInterestingly, areas currently under study in AI research aim\\nin this direction, endowing AI-based systems with the ability\\nto learn continually from infinite streams of varying data\\n[35], to quantify and communicate their confidence in their\\noutputs [36], or to characterize and consolidate new patterns\\narising from data over time [37].\\nWe end the discussion about how technical robustness\\ncan be supported in AI-based systems by highlighting the\\npotential that techniques used to address other requirements\\ncan bring to technical robustness. For instance, explainabil-\\nity techniques can help make models more robust, since they\\ncan show which features are more stable to out of distribution\\nchanges in the input or adversarial attacks. Likewise, the\\nintensity of changes needed to reach a target adversarial con-\\nfidence score in counterfactual generation can be a reliable\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 10 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nestimator of the extent to which a certain data instance can\\nbe considered to be out of distribution [38]. All in all, these\\nexamples are a few among the multiple cases in which a\\nfunctionality added to an AI-based system can simultane-\\nously contribute to several requirements for trustworthiness.\\n5.3.2. Safety\\nEvolving from a generic Information Technologies con-\\ntext, safety in AI [39, 40, 41] is developing in relation to the\\nalignment with human values. In this sense, concrete proto-\\ncols and procedures are challenging to define, but necessary\\nfor AI safety. Safety in AI concerns several unsolved research\\nissues [40], including:\\n• Attaining robustness as the objective of withstanding haz-\\nards, and building systems less vulnerable to adversarial\\nthreats such as adversarial perturbations which cause high\\nconfidence mistakes, and robust to long tails.\\n• Facilitating tools to inspect AI-based systems, identify\\nhazards and anomalies, calibrate them, identify honest\\noutputs, and detect emergent capabilities. One risk of\\nAI systems that links with the need for safety tools is\\nthat they may carry backdoors [42]: backdoored models\\nbehave correctly in nearly all scenarios, except in chosen\\nscenarios taught to behave incorrectly due to the training\\non poisoned data as a way to have backdoors injected.\\nThese are problematic, specially in foundational models\\nthat serve as the architectural backbone of downstream\\nmodels, all evolved from originally poisoned data from\\nmassive training datasets [40].\\n• Defining safety objectives in order to steer models, ei-\\nther internally (how models should learn to guarantee\\ncompliance with safety metrics) and externally (how such\\nsafety compliance can be safely pursued). Problems in this\\nregard include:\\n– Value learning, as the inability of AI systems to code\\nhuman values (e.g., happiness, sustainability, mean-\\ningful experiences or safe outcomes). 
Although giving\\nopen-world inputs to models can partially tell apart\\npleasant and unpleasant states, utility values of such\\nstates are no ground truth values, and are a result of\\nthe model’s own learned utility function [43].\\n– Proxy gaming: This is a phenomenon due to the fact that\\noptimizers and adversaries can manipulate objective\\nproxies. As Goodhart’s law states, a measure ceases\\nto be a reliable indicator when it becomes the target.\\nFor instance, proxy gaming occurs as reward hacking in\\nreinforcement learning. Similarly, objective countable\\nmetrics end up substituting human values when opaque\\nAI models are forced to learn by optimizing a single\\nquantitative measure13. Therefore, merely acquiring a\\nproxy for human values is insufficient: models must\\nalso be resilient to solvers seeking to manipulate it.\\n13These are also known as weapons of math destruction [44] that may\\ncontain pernicious feedback loops that perpetuate stereotypes and biases\\n[45] if they do not consider context nor a concrete person’s features, but\\nrather those of its neighbors.\\n5.3.3. Reproducibility\\nOnce robustness and safety have been addressed, an\\nimportant dimension in this key requirement for trustworthy\\nAI is reproducibility. It can be defined as the ability of AI\\nexperiments to exhibit the same behavior when repeated\\nunder the same conditions. Reproducibility is related to\\nreplicability, which refers to the capability to independently\\nachieve non-identical conclusions that are at least similar\\nwhile differences in sampling, research procedures and data\\nanalyses may exist [14]. Since both concepts are essential\\nparts of the scientific method, the National Information Stan-\\ndards Organization (NISO) and the Association for Comput-\\ning Machinery (ACM) redefine these concepts as:\\n• Repeatability (same team, same experimental setup),\\nwhich means that an individual or a team of individuals\\ncan reliably repeat his/her/their own experiment.\\n• Replicability (different team, same experimental setup):\\nan independent group of individuals can obtain the same\\nresult using artifacts which they independently develop in\\ntheir entirety.\\n• Reproducibility (different team, different experimental\\nsetup with stated precision): a different independent group\\ncan obtain the same result using their own artifacts.\\nIt should be clear that when formulated in the context\\nof trustworthy AI systems, one should regard an experiment\\nin the above definitions as the performance, robustness and\\nsafety evaluation of a given AI-based system. This evalua-\\ntion can be done by different groups (as in research) or by\\na certification lab (as in commercial software-based solu-\\ntions). The extent to which reproducibility can be guaranteed\\nin trustworthy AI systems depends on the confidentiality\\nof the system or the singularity of the experimental setup\\nfor which the AI-based system was developed, among other\\nconstraining circumstances. For instance, in mild contexts\\n(as in research), reproducibility of experiments by third\\nparties is often favored by public releases of the source code\\nimplementing the AI-based system being proposed.\\n5.4. Requirement 3: Privacy and data governance\\nWHAT does it mean? 
This requirement ensures respect for privacy and data protection throughout the AI system life cycle (design, training, testing, deployment and operation), adequate data governance mechanisms that take into account the quality and integrity of the data and its relevance to the domain, and legitimized access to data and processing protocols.
WHY is it important for trustworthiness? AI systems based on digital records of human behavior can be capable of inferring individual preferences and revealing sensitive personal information such as sexual orientation, age, gender, or religious or political views. Since AI-based systems learn from data, they must guarantee that such personal information is not revealed while data is processed, stored and retrieved throughout the AI life cycle, facilitating means to trace how data is used (governance) and verifying that protected information is not accessed (privacy awareness) during the life cycle phases. If such guarantees are not provided, AI-based systems will not be trusted by end users, nor will they conform to existing legislation (e.g. the European GDPR). Citizens should have full control over their data, and their data should not be unlawfully or unfairly used to harm or discriminate against them [28]. This requirement is important to preserve human rights such as the right to privacy, intimacy, dignity or the right to be forgotten. Keeping the usage and scope of the data limited, protected and informed is paramount, since digital information can be used to cluster a person into profiles that may not reflect reality, reinforcing stereotypes or historical differences among minorities, or perpetuating historical or cultural biases [44].
HOW can this requirement be met in practice? In the following we analyze technologies that can maintain data privacy in AI-based systems (Subsection 5.4.1), and strategies to deal with data governance as quality and integrity processes (Subsection 5.4.2).
5.4.1. Data privacy
In order to translate the data privacy requirement into actual technologies, we emphasize the relevance of Federated learning (FL) [46, 47], homomorphic computing [48] and differential privacy (DP) [49] as examples of privacy-aware technologies in the current AI landscape:
• In FL, a model is trained across multiple decentralized devices without moving the data to a central location. Instead of delivering all the data to a central server, devices learn models locally using their own data, so that only numerical model updates are sent to the central server. The central server aggregates the updated model parameters from all the devices or servers to create a new model. This allows learning a global model leveraging all data in situations where the data is sensitive. Besides preserving the privacy of local data, FL can reduce communication costs and accelerate model training.
• In homomorphic computing, data can be processed in encrypted form without the need to decipher it first. As a result, data remains secure and private by performing operations directly on encrypted data.
By using specially\\ndevised mathematical operations, the underlying structure\\nof data is preserved while it is processed, so that the result\\nof the computation, which is also encrypted, stays the\\nsame. Only authorized parties having the decryption key\\ncan access this information. Homomorphic computing\\ncan be an effective way to implement privacy-aware pre-\\nprocessing, training and inference in AI-based systems.\\n• Finally, DP enables processing and learning from data\\nwhile minimizing the risk of identifying individuals in\\nthe dataset at hand. To this end, DP injects random noise\\nto the data before it is processed. This noise is calibrated\\nto guarantee that the data remains statistically accurate,\\nwhile concealing any information that could be used to\\nidentify individuals and thereby, compromise their pri-\\nvacy. The amount of noise added to data balances between\\nthe level of privacy protection provided by DP and the\\nperformance degradation of the AI-based system when\\ncompared to the case when no noise is injected.\\nBy resorting to any of the above technologies (also\\ncombinations of them), the privacy of individuals in the\\ndatasets is preserved, minimizing their risk of harm.\\n5.4.2. Data governance: Quality and integrity of data\\nand access to data\\nData protocols must govern data integrity and access for\\nall individuals even if these are not users of the AI system.\\nOnly duly qualified staff, with explicit need and competence,\\nshould be allowed to access individuals’ data. As a part of AI\\ngovernance, data governance calls for a broader level regula-\\ntion than a single country or continent regulation. This con-\\ntext has motivated guidelines and recommendations for AI\\ngovernance over the years emphasizing on the importance\\nof ensuring data quality, integrity and access. An example\\ncan be found in the Universal Guidelines for AI published\\nin 2018 [50], which were endorsed by 70 organizations and\\n300 experts across 40 countries. In these guidelines, Data\\nQuality Obligation was established as one of the principles\\nthat should be incorporated into ethical standards, adopted\\nin regulations and international agreements, and embedded\\ninto the design of AI-based systems. These recommenda-\\ntions helped inform the OECD AI Principles (2019), the\\nUNESCO Recommendation on AI Ethics (2021), the OSTP\\nAI Bill of Rights (2022), and the EU AI Act and the Council\\nof Europe Convention on AI.\\nIn terms of guidelines to implement data governance,\\nthe Information Commissioner’s Officer (ICO) has proposed\\nrecommendations on how to use AI and personal data appro-\\npriately and lawfully [51]. Among these, there are actions\\nsuch as taking a risk-based approach when developing and\\ndeploying AI – “addressing risk of bias and discrimination at\\nan early stage”, “ensuring that human reviews of decisions\\nmade by AI is meaningful”, “collect only data needed and\\nno more”, and “working with external suppliers to ensure\\nthe use of AI will be appropriate”.\\nAt the European level, the European Strategy for Data\\nestablished in 2020 aims at making the EU a role model\\nfor a society empowered by data. This strategy has given\\nrise to the European Data Governance Act [52] to facilitate\\ndata sharing across sectors and Member States. 
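Before detailing these governance instruments further, the differential privacy technology mentioned in Subsection 5.4.1 can be grounded with a small example. The sketch below applies the classical Laplace mechanism to a numeric query; the function name and parameter values are illustrative assumptions, not code from the cited works.

```python
# Illustrative sketch of the Laplace mechanism used in differential privacy (DP):
# noise calibrated to the query sensitivity and the privacy budget epsilon is added
# before a result leaves the data holder. Names and values are assumptions for this example.
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float,
                      rng: np.random.Generator) -> float:
    """Return a differentially private version of a numeric query result."""
    scale = sensitivity / epsilon   # smaller epsilon -> more noise -> stronger privacy
    return true_value + rng.laplace(loc=0.0, scale=scale)

# Toy usage: release a privacy-preserving count of records matching a sensitive condition.
rng = np.random.default_rng(42)
exact_count = 1_203
private_count = laplace_mechanism(exact_count, sensitivity=1.0, epsilon=0.5, rng=rng)
print(round(private_count))
```

The choice of epsilon makes explicit the trade-off noted above between privacy protection and the statistical accuracy of what the AI-based system can learn from the released data.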
In particular,\\nthe EU Data Governance Act intends to make public sector\\ndata available for re-use, promote data sharing among busi-\\nnesses, allow the use of personal data through a “personal\\ndata-sharing intermediary”, help exercising rights under the\\nGeneral Data Protection Regulation (GDPR), and allowing\\ndata use on altruistic grounds [52].\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 12 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nLater in 2022, the European Union strategy for data\\nproposed the Data Act [53]14, a regulation harmonizing rules\\non fair access to and use of data. In practice, this regulation\\ncomplements the Data Governance Act by specifying who\\ncan create value from data and under which circumstances.\\nIn practice, the Data Act will take action to 1) increase legal\\ncertainty for companies and consumers who generate data,\\non who can use what data and under which conditions, 2)\\nprevent abuse of contractual imbalances that hinder fair data\\nsharing. 3) provide means to the public sector to access data\\nof interest held by the private sector; 4) set the framework\\nconditions for customers. Therefore, the benefits of the Data\\nAct for consumers and business include, from achieving\\ncheaper aftermarket prices for connected objects, to new\\nopportunities to use services based on data access, and\\nbetter access to data produced by devices. Serving these two\\nEU regulations, ten European common data spaces exist,\\nranging from industry to mobility .\\n5.5. Requirement 4: Transparency\\nWHAT does it mean? Transparency is the property that\\nensures appropriate information reaches the relevant stake-\\nholders [29]. When it comes to AI-based systems, differ-\\nent levels of transparency can be distinguished [54]: sim-\\nulatability (of the model by a human), its decomposability\\n(the ability to explain the model behavior and its parts),\\nand algorithmic transparency (understanding the process of\\nthe model and how it will act for any output). Another\\nclassification establishes transparency at the algorithmic,\\ninteraction and social levels [55], emphasizing the role of the\\nstakeholder audience to which the explanation is targeted:\\ndeveloper, designer, owner, user, regulator or society.\\nWHY is it important for trustworthiness? In the context\\nof trustworthy AI systems, data, the system itself and AI\\nbusiness models should be transparent. Humans must be\\ninformed of systems capabilities and limitations and always\\nbe aware that they are interacting with AI systems [3].\\nTherefore, explanations should be timely, adapted and com-\\nmunicated to the stakeholder audience concerned (layperson\\nregulator, researcher or other stakeholder), and traceability\\nof AI systems should be ensured.\\nHOW can this requirement be met in practice? The\\ndimensions to be treated within this requirement concern\\ntraceability, explainability and communication, which are\\nessential for realizing transparent AI-based systems. In the\\nfollowing we will first explain what traceability stands for\\n(Subsection 5.5.1), the current state of the art on explainable\\nAI (Subsection 5.5.2), and mechanisms for communicating\\nAI systems decisions (Subsection 5.5.3).\\n14Data Act & Data Act Factsheet, https://digital-strategy.ec.europ\\na.eu/en/policies/data-act, accessed on April 25th, 2023.\\n5.5.1. 
Traceability\\nTraceability is defined as the set of mechanisms and\\nprocedures aimed to keep track of the system’s data, devel-\\nopment and deployment processes, normally through doc-\\numented recorded identification [14]. Traceability and log-\\nging from the early design stages of the AI-based systems\\ncan help auditing and achieving the required level of trans-\\nparency according to the needs of the concerned audience.\\nIn this regard, special attention must be paid to prove-\\nnance tools [56], which ease the traceability or lineage of\\ndata and model decisions, hence contributing to the require-\\nment of transparency. In this area, the use of Blockchain\\nmechanisms are promising towards guaranteeing the in-\\ntegrity of data used to train (and explain) machine learning\\nmodels, i.e., the provenance of databases, their associated\\nquality, bias and fairness.\\n5.5.2. Explainability\\nThe so-called eXplainable AI (XAI) [54] field is widely\\nand globally being recognized as a crucial feature for the\\npractical deployment of trustworthy AI models. Existing\\nliterature and contributions already made in this field include\\nbroad insights into what is yet to be achieved [54, 57, 58].\\nEfforts have been invested towards defining explainability\\nin machine learning, extending previous conceptual propo-\\nsitions and requirements for responsible AI by focusing on\\nthe role of the particular audience for which explanations are\\nto be generated [54]: Given an audience, an explainable AI is\\none that produces details or reasons to make its functioning\\nclear or easy to understand.\\nExplainability techniques are blooming as tools to sup-\\nport algorithmic auditing. They have emerged as a necessary\\nstep to validate and understand the knowledge captured by\\nblack-box models, i.e., a system in which only inputs and\\noutputs are observed without knowing the internal details\\nof how it works. This can be problematic, as we cannot\\npredict how the system may behave in unexpected situations\\n(connecting with the technical robustness requirement, Sub-\\nsection 5.3), or how it can be corrected if something goes\\nwrong (linked to the accountability requirement, Subsec-\\ntion 5.8). Explaining which input factors contribute to the\\ndecisions of complex black-box algorithms can provide a\\nuseful global view of how the model works, jointly with\\ntraceability methods and a clear and adapted communication\\nof information to the target audience.\\nSince the quality of explanations depends on the audi-\\nence and the motivation for which they are generated, several\\ntaxonomies of XAI techniques have been proposed over\\nthe years [54]. A primary distinction can be done between\\nmodel-agnostic and model-specific approaches to explaining\\nmachine learning models, the difference being whether the\\nXAI technique can be applied to any machine learning model\\ndisregarding their structure and learning algorithm. Another\\ndistinction can be done between ex-ante and post-hoc XAI\\ntechniques, depending on the moment at which explainabil-\\nity is addressed (before or after the model is designed and\\ntrained). On one hand, ex-ante techniques (also referred to as\\nN. Díaz-Rodríguez, J. 
Del Ser et al.: Preprint submitted to Elsevier\\nPage 13 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nthe explainable-by-design paradigm) make AI models aspire\\nto provide an explanation that avoids the construction of\\nadditional models or extra complexity (layers or mechanisms\\nnot originally part of the original one), so that explanations\\nare as faithful to the real reasoning carried out by the model\\nas possible. On the other hand, post-hoc XAI techniques\\nusually add artifacts around the original AI model or build a\\nsurrogate of it – a local approximation or simpler version\\nof the original one – in order to more easily explain the\\noriginal one (for example, LIME [59]). Likewise, some XAI\\ntechniques may use external knowledge (e.g. from the web,\\nWikipedia, forums) [60], for instance, to explain language\\nmodels or dialogue models that interactively answer ques-\\ntions about a model’s particular decision.\\nOther criteria to categorize XAI tools can be formulated,\\nsuch as the format of the issued explanations (e.g., attri-\\nbution methods [61], counterfactual studies [62], simplified\\nmodel surrogates [63]) or the hybridization of explanations\\nexpressed in different modalities, such as visual and textual\\n(e.g., linguistic summaries [64], ontologies [65], or logical\\nrules defined on top of knowledge graphs [66], to cite a\\nfew). Natural language explanations [67, 27], quantitative\\nmeasures of the quality of explanations [68, 69], and models\\nthat support their learning process with formal symbolic\\nbasis such as language, rules, compositional relationships or\\nknowledge graphs (neural-symbolic learning and reasoning\\n[66]) are key for explanations to be understood by non-expert\\naudience. These interfaces allow such users to assess the\\noperation of the model in a more intelligible fashion, hence\\nsupporting the human agency and oversight requirement for\\ntrustworthy AI systems (Subsection 5.2).\\n5.5.3. Communication\\nThe third dimension of transparency is how the audience\\nis informed about the AI-based system, namely, how expla-\\nnations or information tracked about the system’s operation\\nis communicated to the user. Humans should know when\\nthey are interacting with AI systems, as well as be notified\\nabout their performance, instructed about their capabilities,\\nand warned about their limitations. The same holds when\\nconveying the model’s output explanation and its function-\\ning to the user. The adaptation of the explanation must be in\\naccordance to the specifics of the AI system being explained\\nand the cognitive capabilities (knowledge, background ex-\\npertise) of the audience.\\nTherefore, communication is a crucial dimension, so\\nthat all aspects related to transparency are delivered to the\\naudience in a form and format adapted to their background\\nand knowledge. This is key to attain trust in the audience\\nabout the AI-based system at hand.\\n5.6. Requirement 5: Diversity, non-discrimination\\nand fairness\\nWHAT does it mean? This requirement contemplates dif-\\nferent dimensions: the avoidance of unfair bias, diversity\\nfostering, accessibility to all regardless any disability, and\\nthe involvement of stakeholders in the entire AI system life\\ncycle. All these dimensions of this manifold requirement\\nshare an ultimate purpose: to ensure that AI-based systems\\ndo not deceive humans nor limit their freedom of choice\\nwithout reason. 
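To give a concrete flavor of what “unfair bias” means in quantitative terms, the short sketch below computes one simple group-fairness indicator, the demographic parity difference, for binary predictions split by a protected attribute. It reflects only one of the many fairness notions discussed later in Subsection 5.6.2, and all names and data are illustrative assumptions.

```python
# Illustrative sketch: demographic parity difference, i.e. the gap between positive-prediction
# rates across two groups defined by a protected attribute. Names and data are toy examples.
import numpy as np

def demographic_parity_difference(y_pred: np.ndarray, group: np.ndarray) -> float:
    """Absolute gap in positive prediction rates between group 0 and group 1."""
    rate_group0 = y_pred[group == 0].mean()
    rate_group1 = y_pred[group == 1].mean()
    return abs(rate_group0 - rate_group1)

# Toy usage: binary loan-approval predictions for two demographic groups.
y_pred = np.array([1, 1, 1, 1, 0, 1, 0, 0, 1, 1])
group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
print(demographic_parity_difference(y_pred, group))   # 0.8 vs 0.6 -> gap of about 0.2
```

A value of zero would indicate equal positive rates across groups under this particular notion; the pre-, in- and post-processing methods reviewed below aim at driving such gaps down without unduly sacrificing accuracy.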
Therefore, it is a requirement tightly linked\\nto the ethical and fairness principles that underlie trustwor-\\nthiness in AI (Section 2).\\nWHY is it important for trustworthiness? This require-\\nment is necessary to broaden the impact of AI to all social\\nsubstrates, as well as to minimize the negative effects that\\nautomated decisions may have in practice if data inherits\\nbiases hidden in the data from which models are learned.\\nUnfair bias in data must be avoided as decisions drawn by a\\nmodel learned from such data could have multiple negative\\nimplications, including the marginalization of vulnerable\\ngroups, the exacerbation of prejudice or discrimination [3].\\nHOW can this requirement be met in practice? Meth-\\nods to tackle this requirement can be classified depending on\\nthe specific dimension they support: as such, methods to en-\\nforce diversity, non-discrimination, accessibility, universal\\ndesign and stakeholder participation are briefly revisited in\\nSubsection 5.6.1, whereas Subsection 5.6.2 describes how to\\nachieve fairness in AI-based systems. Finally, Section 5.6.3\\nexamines mechanisms to avoid unfair bias.\\n5.6.1. Diversity, non-discrimination, accessibility,\\nuniversal design and stakeholder participation\\nAI systems should take into account all human abilities,\\nskills and requirements, and ensure accessibility to them.\\nDeveloping methodologies based on the requirement of non-\\ndiscrimination and bias mitigation is paramount to ensure\\nthe alignment of requirements to the compliance with ethical\\nvalues. Assuring properties of non-discrimination, fairness\\nand bias mitigation restrict the systematic differences treat-\\ning certain groups (of people or objects) with respect to\\nothers [29]. A practical example of recommendation encour-\\nages, e.g., hiring from diverse backgrounds, cultures and\\ndisciplines to assure opinion diversity.\\nThis requirement involves the inclusion of diverse data\\nand people, and ensures that individuals at risk of exclusion\\nhave equal access to AI benefits. Concrete implementations\\nof this requirement range from quantifying the impact of\\ndemographic imbalance [70] and the effects of missing data\\n(which, as a matter of fact, has been shown to be beneficial\\nin terms of fairness metrics [71]).\\nIn what refers to diversity, it advocates for the needs for\\nheterogeneous and randomly sampling procedures for data\\nacquisition, diverse representation of a population that in-\\ncludes minorities, and the assurance for non-discriminating\\nautomated processes that lead to unfairness or biased mod-\\nels. Diversity can be applied at the technical level dur-\\ning model training by penalizing the lack of diverse pro-\\ntotypes on latent space areas with challenging separation\\nbetween classes [72]. Alternatively, the lack of diversity\\ncan be counteracted by means of imbalanced learning or\\nby informing data augmentation. When placing the focus\\non the solutions of an AI-based system, their diversity is a\\nN. Díaz-Rodríguez, J. 
Del Ser et al.: Preprint submitted to Elsevier\\nPage 14 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nvery relevant component to guarantee non-biased outcomes.\\nFrameworks unifying quality and diversity optimization can\\nguarantee the diversity of generated solutions that may later\\nserve in robotics to learn behaviorally diverse policies [73].\\nFrom a global perspective, the so-called discrimination-\\nconscious by-design paradigm collective refers to method-\\nologies where discrimination detection and prevention is\\nconsidered from the beginning of the design of the AI-\\nbased system through fairness [74]. Methods adopting this\\nparadigm include discrimination-aware data mining [75],\\ncompositional fairness, interpretation of sanity checks and\\nablation studies.\\nIn summary, diversity must be enforced both in the data\\nfrom which models are learned and among the stakeholders,\\ni.e., fostering the inclusion of minorities (practitioners and\\nusers) [7, 22] or using methodologies such as participatory\\ndesign for accessibility [76]. Universal Design principles,\\nwhich consider accessibility and “design for all” [3] dur-\\ning development, governance, policy and decision making\\nprocesses is one way to facilitate AI life cycles that take\\ninto account what is beneficial for everyone, accounting for\\ndifferent conditions and situations, and no discrimination.\\nTo further enable universal design and stakeholder par-\\nticipation, often feedback – even after deployment – is\\nsought for stakeholder participation and consultation. One\\nway to achieve this is through active learning for machine\\nlearning systems. Active learning allows for the integration\\nof users’ feedback while models are learned, and enables\\ninteractivity with the user, one of the goals targeted by\\nhuman-centered AI [77] and AI for social good [23].\\n5.6.2. Fairness\\nThe second dimension of this requirement is fairness,\\nnamely, techniques aimed to reduce the presence of unfair\\noutputs elicited by AI-based systems. An unfair algorithm\\ncan be defined as that producing decisions that favor a partic-\\nular group of people. Following the comprehensive view on\\nthis topic published in [78], biases leading to such unfair de-\\ncisions can be propagated from the data to the AI algorithm\\n(including measurement, omitted variable sampling, or rep-\\nresentation biases, among others); from the algorithm to the\\nuser (as in algorithmic, popularity or evaluation biases); or\\nfrom the user to the data (respectively, biases induced in\\nthe production of content, temporal, historical and/or social\\nbiases).\\nFairness guarantees in the decisions of AI-based systems\\nhas been approached extensively in the literature, reporting\\nbias targeting methods that can be classified in three large\\ngroups:\\n• Pre-processing methods, where the available data are\\ntransformed for the source of bias to be reduced and at\\nbest, removed.\\n• In-processing methods, which modify the learning algo-\\nrithm of the model at hand (by e.g. 
changing the objective\\nfunction at hand or imposing constraints to the optimiza-\\ntion problem) so as to minimize the effect of biases in the\\ntraining process.\\n• Post-processing methods, which operate on the output\\nof the model (for instance, by reassigning the predicted\\nclass for a query instance) without modifying its learning\\nalgorithm or the training data from which it was learned.\\nIn general, it is widely acknowledged that fairness can\\nbe achieved by sacrificing accuracy to a certain extent [79].\\nHowever, it is also possible to debias machine learning mod-\\nels from the influence of spurious features to even improve\\ntheir performance [80]. Another trade-off is between fairness\\nand privacy. Here, adversarial learning [81] can simultane-\\nously learn a predictor and an adversary that models a pro-\\ntected variable, and by minimizing the adversary capacity to\\npredict this protected variable, accurate predictions can show\\nless stereotyping of the protected variable, almost achieving\\nequality of odds as a fairness notion.\\nAn important concept to be acquainted with when deal-\\ning with fairness in AI-based systems is fairwashing: as a\\nrisk of rationalization, fairwashing is the promotion of a false\\nperception that a machine learning model respects ethical\\nvalues through an outcome explanation and fairness metric\\n[82]. This makes it critical to characterize the manipulability\\nof fairwashing [83], as well as LaundryML approaches [82]\\nto better audit unfair opaque models.\\n5.6.3. Avoidance of unfair bias\\nData and models can be exposed to a large set of potential\\nbias-inducing phenomena. Ensuring diversity, representa-\\ntiveness and completeness in data and models needs to be\\na core part of the full AI life cycle (design, development\\nand deployment phases of AI-based systems). Bias can be\\nuncovered through proxy discrimination by models, since\\nproxy variables are likely to be picked up, showing features\\nas proxy that otherwise would not have been considered, i.e.,\\nzip codes in predictive policing [44]. As has been shown in\\nthe previous dimension, bias is not only algorithmic, but ex-\\ntends beyond the limits of models in a vicious cycle: starting\\nwith human activity bias, data bias, leads to sampling bias\\non the web (specially to be considered in the use of data to\\nlearn generative models), algorithmic bias, interaction bias\\nand finally, self-selection bias that can revert back into the\\nalgorithm a second-order bias [84].\\nBias mitigation techniques include several approaches\\n[45, 85], from generic requirements and toolboxes [86] to\\nconcrete taxonomies of bias [87, 88] at different stages of\\nthe AI life cycle [89]. Different notions of fairness can be\\nalso defined [78, 90], including causal fairness – which relies\\non causal relations and requires establishing causal graphs\\n– or counterfactual fairness. Causality can help debugging\\nalgorithmic bias mitigation or explaining models [91], e.g.,\\ncausal mediation analysis can help uncover disparate impact\\nof models by estimating the fairness associated to different\\nexplaining variables [92].\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 15 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\n5.7. Requirement 6: Societal and environmental\\nwellbeing\\nWHAT does it mean? AI-based systems should benefit all\\nhumankind, not only at the present time but also in future\\ngenerations. 
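As a brief, illustrative aside (ours, not part of the cited works), the group-conditioned quantities that the pre-, in- and post-processing methods above try to balance, such as demographic parity and the equality of odds mentioned earlier, can be computed with a few lines of plain Python; all data below are hypothetical.

```python
# Minimal sketch (not from the paper): group-wise fairness gaps that pre-,
# in- and post-processing bias mitigation methods try to reduce.
# The predictions, labels and protected attribute below are hypothetical.

def rate(flags):
    return sum(flags) / len(flags) if flags else 0.0

def fairness_gaps(y_true, y_pred, group):
    """Demographic parity and equalized-odds gaps between two groups (0/1)."""
    gaps = {}
    # Demographic parity: difference in positive prediction rates per group.
    pos = {g: [p for p, a in zip(y_pred, group) if a == g] for g in (0, 1)}
    gaps["demographic_parity"] = abs(rate(pos[0]) - rate(pos[1]))
    # Equalized odds: differences in true-positive and false-positive rates.
    for name, label in (("tpr_gap", 1), ("fpr_gap", 0)):
        per_group = []
        for g in (0, 1):
            preds = [p for p, t, a in zip(y_pred, y_true, group) if a == g and t == label]
            per_group.append(rate(preds))
        gaps[name] = abs(per_group[0] - per_group[1])
    return gaps

# Hypothetical binary classifier outputs and a binary protected attribute.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]
print(fairness_gaps(y_true, y_pred, group))
```

Gaps such as these are precisely what the bias mitigation methods above aim to shrink, ideally with as small a loss of accuracy as possible.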
Therefore, AI-based systems must be sustain-\\nable and environmentally friendly, so that the technological\\nadoption of AI does not entail a progressive depletion of\\nnatural resources and maintains an ecological balance [29].\\nTherefore, dimensions supporting this requirement include\\nsustainability and environmental friendliness, as well as a\\ncareful assessment of the social and societal impacts of AI.\\nWHY is it important for trustworthiness? AI systems\\nshould increase positive social change and enhance sus-\\ntainability and ecological responsibility. Although they can\\nbe an effective tool to mitigate climate change [93, 94],\\ngreenhouse gases emitted by the computationally intensive\\ntraining processes of complex AI-based systems can exacer-\\nbate existing social and ethical challenges linked to AI [95].\\nFor instance, training only one single AI model can emit\\nas many CO2 emissions as five cars in their entire lifetime.\\nComputational and environmental costs grow proportionally\\nto the complexity of the model in terms of its number\\nof parameters [96]. In particular, this study was done for\\nlarge language models [97] which cost about 8.4 tons per\\nyear, where an average carbon footprint for a person yearly\\nis around 4. Although emissions are amortized over the\\nmodel lifetime, the recent ChatGPT model was estimated\\nto consume 1,287 MWh that translates into a cost of 522\\ntCO2e [98]. Therefore, energy and policy considerations\\nare to be taken into account by institutions and companies\\nimplementing AI [97].\\nHOW can this requirement be met in practice? This\\nrequirement is currently approached from two different an-\\ngles: sustainability and environmental wellbeing (Subsec-\\ntion 5.7.1) and societal wellbeing (Subsection 5.7.2).\\n5.7.1. Sustainability and environmental wellbeing\\nSustainable AI [99] considers a holistic perspective that\\nspans from models to data algorithms and hardware, and\\nhow software-hardware co-design can help mitigate carbon\\nfootprints of AI model life cycles (design, training and de-\\nployment stages). As mentioned previously, sustainable AI\\nfinds its motivation in the costly energy consumption of large\\nAI models. Thus, sharing key learned lessons, best design\\npractices, metrics, and standards is key for a sustainable\\ndevelopment of AI systems. Technical contributions aimed\\nto implement this requirement for the sustainability of AI\\nare at the core of the Green AI research area [100], which\\nstudies efficient and ecologically aware designs of AI-based\\nalgorithms, systems and assets.\\nMany strategies to attain this requirement have been\\nproposed over the years to reduce the environmental impact\\nof AI models, with emphasis on those characterized by\\na large number of parameters and requiring long training\\nlatencies (e.g., deep neural networks). Among others:\\n• Assessment of the environmental impact of AI-based\\nsystems with e.g., carbon footprint calculators15 [101].\\nEvaluating the factors that influence AI’s greenhouse gas\\nemissions is the first step towards mitigating its negative\\neffects [95].\\n• Selection of the most relevant and necessary data, i.e.,\\nwith smart data approaches [102].\\n• Model compression [103, 104], e.g. 
using quantization [105], distillation [106, 107] or acceleration [108] techniques.
• Consideration of efficiency as an evaluation metric and as a price tag to make models greener and more inclusive for researchers with limited resources [100].
• Use of models that can rapidly adapt to new situations, domains and similar tasks by virtue of learning functionalities specifically devoted to this adaptation (e.g., multitask learning, few-shot learning, AutoML, meta-learning, neural architecture search or open-ended learning). This family of GPAIS can provide more efficient, sustainable and less data-dependent AI systems.
• Deployment of models on cloud computing servers fed with renewable energy sources, to minimize CO2 emissions.
5.7.2. Societal wellbeing
At the societal level, AI can improve social welfare. AI-based systems can perform routine tasks in an autonomous, safer and more efficient fashion, enhancing productivity and improving the quality of life of humankind. In public administration, AI can speed up processes, smooth administrative bottlenecks and save paperwork. Furthermore, it can aid policy making and help city planners, e.g., by visualizing the consequences of climate change, predicting future floods, or identifying urban heat islands. Possibilities for society at large to benefit from AI developments have exploded in recent years with the progressive digitization of almost all sectors of activity. Infrastructure planning, health and hunger, equality and inclusion, education, economic empowerment, security and justice are among those sectors where AI can unleash its full potential to foster use cases of societal impact.
Bringing such benefits of AI into practice is, therefore, a matter of leveraging the vast amounts of available data in AI-based systems. Such AI-based systems address learning tasks that solve a problem of societal impact, such as the ones exemplified above. However, since decisions issued by the AI-based system affect human beings and are subject to social scrutiny, other requirements of trustworthy AI become of utmost relevance, including fairness, privacy, transparency or human oversight. Above all, the importance of AI ethics and regulation becomes paramount in societal wellbeing, since decisions issued in use cases arising in education, justice and security have to comply with fundamental human rights and the legal restrictions in force.
15Greenhouse Gases Equivalencies Calculator, https://www.epa.gov/energy/greenhouse-gases-equivalencies-calculator-calculations-and-references, accessed on April 25th, 2023.
5.8. Requirement 7: Accountability
WHAT does it mean? This last requirement of trustworthy AI systems imposes the provision of mechanisms to ensure responsibility and accountability for the development, deployment, maintenance and/or use of AI systems and their outcomes. Auditability, which enables the assessment of algorithms, data and design processes, plays a key role in accountability, namely, the attribution of the results of the actions that were taken based on the outcome of the AI-based system. 
Accountability, therefore, implies the min-\\nimization of harm and reporting of negative impact, the\\ncommunication of design trade-offs to the user, and the\\nimplementation of adequate and accessible redress strategies\\nassociated to AI-based systems. Therefore, auditability and\\naccountability are closely related to each other and lie at the\\ncore of responsible AI systems, which are later discussed in\\nSection 6.1.\\nWHY is it important for trustworthiness? The required\\nauditability property of Trustworthy AI systems demands\\nthe development of practical tools [109] that are capable\\nof verifying desirable properties of neural networks such\\nas stability, sensitivity, relevance or reachability [110], as\\nwell as metrics beyond explainability [111, 112, 113, 114,\\n115], such as on traceability, data quality and integrity.\\nAuditability is becoming increasingly important when stan-\\ndards are being materialized touching upon all AI require-\\nments. This includes IEEE, ISO/IEC and CEN/CENELEC,\\nwhich are implementing concrete guidelines to apply trust-\\nworthy AI requirements in industrial setups (see [29, 11]\\nfor an overview). At the national level, the German stan-\\ndardization road map on AI within DIN/DKE [11] is a\\nclear exponent of the standardization efforts made by differ-\\nent governments to dictate how practical AI-based systems\\nshould be audited.\\nOn the other hand, accountability is a key requirement\\nto be able to recourse [116] when an AI model contributes\\nto making a proven wrong decision, issuing explanations\\nand recommendations to cases that are unfavorably treated\\nby such decision. Accountability is a matter of compliance\\nwith ethical and legal standards, answerability, reporting and\\noversight, and attribution and enforcement of consequences\\n[117]. Therefore, when framed under AI regulatory stan-\\ndards and ethical principles like the ones discussed in this\\nwork, accountability becomes crucial for AI-based systems\\nto distribute cost, risks, burdens and liabilities among the\\ndifferent stakeholders participating in its life cycle.\\nHOW can this requirement be met in practice? Simi-\\nlarly to other requirements, we next analyze how the different\\ndimensions spanned by this requirement can be tackled\\nin practice. In doing so, Subsection 5.8.1 deals with ac-\\ncountability, whereas Subsection 5.8.2 addresses auditabil-\\nity. The minimization and reporting of negative impacts\\nis discussed in Subsection 5.8.3. Finally, Subsection 5.8.4\\ndescribes methods for algorithmic redress.\\n5.8.1. Accountability\\nMechanisms of accountability are especially relevant in\\nhigh-risk scenarios, as they assign responsibility for deci-\\nsions in the design, development and deployment phases\\nof the AI system. Tools to attain this requirement involve\\nalgorithmic accountability policy toolkits (e.g., [118]), the\\npost-hoc analysis of the output of the model (e.g. via local\\nrelevance attribution methods) or algorithms for causal in-\\nference and reasoning [119]. Since accountability is linked\\nto the principle of fairness, it is closely related to risk man-\\nagement since unfair adverse effects can occur. Therefore,\\nrisks must be identified and mitigated transparently so they\\ncan be explained to and verified by third parties. Therefore,\\ntechniques and tools for auditing data, algorithms and de-\\nsign processes are required for accountable decisions issued\\nby AI-based systems. 
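As a minimal sketch of what such tooling can look like in practice (our illustration, not a tool from the cited literature), an append-only decision log records, for each output of an AI-based system, the metadata an auditor would need to attribute and review that decision; all field names, values and the file path below are assumptions.

```python
# Minimal, illustrative decision log for auditability and accountability.
# Field names, values and the file path are assumptions for this sketch.
import json, hashlib, datetime

def log_decision(path, model_id, model_version, inputs, output, operator):
    """Append one auditable record per decision to a JSON-lines file."""
    record = {
        "timestamp": datetime.datetime.utcnow().isoformat() + "Z",
        "model_id": model_id,
        "model_version": model_version,
        # Hash the raw inputs so the record stays traceable without
        # duplicating personal data inside the log itself.
        "input_hash": hashlib.sha256(json.dumps(inputs, sort_keys=True).encode()).hexdigest(),
        "output": output,
        "responsible_operator": operator,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record

# Hypothetical usage for a hypothetical credit-scoring system.
log_decision("decisions.jsonl", "credit-scoring", "1.4.2",
             {"income": 32000, "age": 41}, {"decision": "reject", "score": 0.31},
             operator="loan-office-madrid")
```

Because records are only appended and inputs are stored as hashes, a log of this kind supports traceability of individual decisions without the log itself becoming a privacy liability.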
An overview on 16 risk assessment\\nframeworks is available in [120], whereas built-in derisking\\nprocesses at design and development phases can be found in\\n[86, 121]. These processes operationalize risk management\\nin machine learning pipelines, including explainability and\\nbias mitigation. Another set of resources to tackle bias and\\nfairness are discussed in [86].\\nEmerging trade-offs between requirements should be\\nstated and assessed with regards to the risk they pose to\\nethical requirements and compromise of fundamental rights,\\nsince no AI system should be used when no risk-free trade-\\noff for these can be found [3]. Consequently, AI models use-\\nful for accountability often involve multi-criteria decision\\nmaking and pipelines at the MLOps level that help delineate\\nand inform such trade-offs to the user.\\n5.8.2. Auditability\\nThe AI Act has been interpreted as the European ecosys-\\ntem to conduct AI auditing [122]. In the strict sense, the\\nneed for certifying systems that embed AI-based function-\\nalities in their design is starting to permeate even within\\nthe international ISO standards for AI robustness. In such\\nstandards, formal methods for requirement verification or\\nrequirement satisfaction, typical of software engineering,\\nare being extended towards verifying desirable properties\\nof AI models. More specifically, in order to certify neural\\nnetworks, properties such as stability, sensitivity, relevance\\nor reachability are sought [110].\\nIn terms of auditing procedures, especially when the AI\\nsystem interacts with users, grading schemes adapted to the\\nuse case [123] are in need for validating models. Examples\\ninclude the System Causability Scale [123] or the Muir\\nTrust Scale [124], which are widely adopted in human robot\\ninteraction and robotics and rely on predictability (To what\\nextent the robot behavior [the output of the AI-based system]\\ncan be predicted from moment to moment?), reliability (To\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 17 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nwhat extent can you count on the system to do its job?),\\ncompetence (What degree of faith does the user have on the\\nsystem for it to cope with similar situations in the future?)\\nand trust (How much does the user trust the system overall?).\\n5.8.3. Minimization and reporting of negative impacts\\nand trade-offs\\nThe urgent need for developing stable and verifiable\\nmechanisms for auditing AI-based systems becomes more\\nrelevant in the case of generative AI, which has grown so\\nmaturely that it is difficult to distinguish between human-\\ncreated multimodal content and those generated by ma-\\nchines. If these are not properly identified, they can gen-\\nerate confusion and deception, which may have negative\\nconsequences for society, such as the manipulation of public\\nopinion or the dissemination of fake news.\\nA promising stream along these lines proposes to land\\nthe implementation of verifiable claims [125], which are\\ndefined as those falsifiable claims for which evidence and\\narguments can be provided to influence the probability that\\nsuch claims are true. This proposal stems from the efforts\\nof developers, regulators and other AI stakeholders, and the\\nneed to understand what properties of AI systems can be\\ncredibly demonstrated, through what means, and what trade-\\noffs or commitments should and can be quantified. 
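As an illustration of how such claims could be handled operationally (a sketch of ours, not a mechanism prescribed in [125]), each claim can be stored together with its supporting evidence so that unsupported claims are easy to surface; the example claims and file names are hypothetical.

```python
# Minimal sketch (not defined in [125]): verifiable claims as structured
# records, so an auditor can see which claims still lack evidence.
from dataclasses import dataclass, field

@dataclass
class VerifiableClaim:
    claim: str                                     # falsifiable statement about the system
    evidence: list = field(default_factory=list)   # links to reports, tests, logs
    verified: bool = False                         # set after independent review

claims = [
    VerifiableClaim("Model accuracy was evaluated on a held-out clinical dataset",
                    evidence=["reports/eval_2023_04.pdf"], verified=True),
    VerifiableClaim("Training data contains no personal data of EU citizens"),
]

unsupported = [c.claim for c in claims if not c.evidence]
print("Claims still lacking evidence:", unsupported)
```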
While the\\ndegree of certainty achievable varies across different claims\\nand contexts, the idea is to demonstrate that greater degrees\\nof evidence can be provided for claims about AI develop-\\nment than is typically done today to facilitate auditing them.\\n5.8.4. Redress\\nLastly, once the risk has turned into a confirmed incident,\\nit is paramount that the user is aware of the possibility\\nto redress, preserving his/her trust when adverse or unfair\\nimpact takes place [3]. Redress is related to the concept of\\nalgorithmic recourse [116], and consists of a procedure to\\ncorrect or reverse an AI system outcome that is considered\\nwrong. A key to trustworthy AI is ensuring adequate redress\\nagainst decisions made by AI systems and by humans op-\\nerating them through accessible mechanisms to their users\\nwhen these fail, without forgetting vulnerable persons or\\ncollectives. Redress mechanisms are to be ensured, and com-\\nplemented with accountability frameworks and disclaimers,\\nsince certification will obey particular application domains,\\nand cannot replace responsibility. Machine unlearning [126],\\ncounterfactual explanations [127] or the analysis of disparate\\nimpacts [128] can be also regarded as techniques that can\\nsupport redress in AI-based systems.\\n6. Trustworthy Artificial Intelligence from\\ntheory to practice and regulation:\\nresponsible Artificial Intelligence systems\\nSo far we have exposed the vision of trustworthy AI that\\nhas been tackled in most of the literature: from a theoretical\\npoint of view, and mainly based on principles and recom-\\nmendations. In this section we highlight the importance of\\ntackling trustworthy AI from a practical perspective. A clear\\nmapping from trustworthy AI principles and requirements\\ninto operative protocols that can be automated, verified and\\naudited does not always exist. To achieve this, the field\\nneeds blueprints and standard models to be adopted and\\nstandardized. In what follows we stress on the utmost impor-\\ntance of having practical regulatory scenarios (regulatory\\nsandboxes) and the final output of processes implementing\\nHRAIs using trustworthy AI: a responsible AI system.\\nAccording to this idea, the section is organized as fol-\\nlows. First, Subsection 6.1 defines the nuanced yet necessary\\nnotion of responsible AI systems, to comply with both\\ntrustworthy AI requirements and the law in force. Then,\\nSubsection 6.2 describes the technical requirements that the\\nimplementation of HRAIs will legally require in practice.\\nThen, Subsection 6.3 presents how these requirements are\\ngoing to be evaluated by regulators and auditors through\\nregulatory sandboxes. Subsection 6.4 examines whether all\\nthese steps can be connected and applied through a blueprint\\nproposal to implement trustworthy AI in healthcare. Finally,\\nSubsection 6.5 examines the implications of new HRAIS and\\nemerging AI systems, justifying the necessity of a dynamic\\nregulation and flexible evaluation protocols to deal with new\\nhigh-risk scenarios supported by these systems.\\n6.1. Responsible Artificial Intelligence systems\\nA little prior to trustworthy AI is the term responsible\\nAI, which has been widely used quite as a synonym. How-\\never, it is necessary to make an explicit statement on the\\nsimilarities and differences that can be established between\\ntrustworthy and responsible AI. 
The main aspect that makes these concepts differ from each other is that responsible AI emphasizes the ethical use of an AI-based system, its auditability, accountability, and liability.
In general, when referring to responsibility over a certain task, the person in charge of the task assumes the consequences of the actions and decisions taken to undertake it, whether they eventually turn out to be right or wrong. When translating this concept of responsibility to AI-based systems, decisions issued by the system in question must be accountable, legally compliant, and ethical. Other requirements for trustworthy AI reviewed in this manuscript (such as robustness or sustainability) are not relevant to responsibility. Therefore, trustworthy AI provides a broader umbrella that contains responsible AI and extends it towards considering other requirements that contribute to the generation of trust in the system. It is also worth mentioning that ensuring responsibility over AI products is linked to the provision of mechanisms for algorithmic auditing (auditability), which is part of requirement 7 (Accountability, Subsection 5.8). Stressing the importance of a responsible development of AI, we now define the responsibility associated with AI systems, following the discussed features.
Definition. A Responsible AI system requires ensuring auditability and accountability during its design, development and use, according to specifications and the applicable regulation of the domain of practice in which the AI system is to be used.
In the following we discuss these features in depth:
1. Auditability: As an element to aid accountability, a thorough auditing process aims to validate the conformity of the targeted AI-based asset with 1) vertical or sectorial regulatory constraints; 2) horizontal or AI-wide regulations (e.g., the EU AI Act); and 3) specifications and constraints imposed by the application for which it is designed. It is important to note that auditability refers to a property sought for the AI-based system, which may require transparency (e.g., explainability methods, traceability), measures to guarantee technical robustness, etc. This being said, the auditability of a responsible AI system may not necessarily cover all requirements for trustworthy AI, but rather those dictated by ethics, regulation, specifications and protocol testing adapted to the application sector (i.e., vertical regulation).
2. Accountability: establishes the liability of decisions derived from the AI system’s output, once its compliance with the regulations, guidelines and specifications imposed by the application for which it is designed has been audited. Again, accountability may comprise different levels of compliance with the requirements for trustworthy AI defined previously.
In the context of the European approach and the AI Act, this translates into a required pre-market use of regulatory sandboxes, and the adaptation of the requirements and regulation for trustworthy AI into a framework for the domain of practice of the AI system.
6.2. 
Artificial Intelligence systems’ compliance\\nwith regulation in high-risk scenarios\\nIt has been concluded in the previous section that the\\nconformity of requirements are central for the definition of\\nresponsible AI systems. In Europe, regulatory requirements\\nin force for the deployment of AI-based systems are pre-\\nscribed based on the risk of such systems to cause harm.\\nIndeed, the AI Act agreed by the European Parliament, the\\nCouncil of the European Union, and the European Com-\\nmission, is foreseen to set a landmark piece of legislation\\ngoverning the use of AI in Europe and regulating this tech-\\nnology based on the definition of different levels of risks:\\nminimal, limited and HRAIs. In these categories different\\nrequirements for trustworthy AI and levels of compliance\\nare established, so that regulatory obligations are enforced\\ntherefrom.\\nFurthermore, the European Commission has also asked\\nthe European Committee for Standardization (CEN), the\\nEuropean Committee for Electrotechnical Standardization\\n(CENELEC) and the European Telecommunications Stan-\\ndards Institute (ETSI) to develop standards aimed to cover\\ndifferent practical aspects of AI systems, including foun-\\ndational AI standards, data standards related to AI, Big\\nData and analytics, use cases and applications, governance\\nimplications of AI and computational approaches of AI.\\nEthical, societal concerns and AI trustworthiness also appear\\namong the prioritized areas of these standardization bodies.\\nAmong these defined levels of risk associated to AI-\\nbased systems, those at the top of this classification (HRAIs)\\nare subject to stringent obligations. HRAIs are demanded\\nto comply with the AI Act through the fulfillment of the\\nfollowing seven requirements (AI Act, Chapter 2 [4]):\\n1. Adequate risk assessment and mitigation systems (Art. 9\\n- Risk management system).\\n2. High quality of the datasets feeding the system to mini-\\nmize risks and discriminatory outcomes (Art. 10 - Data\\nand data governance; Art. 9 - Risk management system).\\n3. Logging of activity to ensure traceability of results (Art.\\n12 - Record Keeping; 20 - Automatically generated logs).\\n4. Detailed documentation providing all information nec-\\nessary on the system and its purpose for authorities to\\nassess its compliance (Art. 11 - Technical documentation;\\nArt. 12 - Record-keeping).\\n5. Clear and adequate information to the user (Art. 13 -\\nTransparency).\\n6. Appropriate human oversight measures to minimise risk\\n(Art. 14 - Human oversight).\\n7. High level of robustness, security and accuracy (Art. 15\\n- Accuracy, robustness and cybersecurity).\\nHRAIs must undergo conformity assessments before\\nentering the EU market. One of the most complete guidance\\nprocedures assisting on complying with AI Act regulation is\\nthe CapAI procedure for conducting conformity assessment\\nof AI systems [129]. It describes notions and metrics, check-\\nlists and other procedures to comply with the new legislation.\\nSince the AI Act imposes obligations on providers, im-\\nporters, distributors, and users, the latter can be deemed as\\nproviders in certain cases. For instance, if a user slightly\\nmodifies or uses a ready-made AI-based product such as\\nChatGPT for a different purpose, this makes him/her become\\nresponsible and accountable for the system’s consequences,\\ndepending on the conditions that define HRAIs. 
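As a purely illustrative sketch of how a provider might track this obligation internally (the checklist structure, status handling and file names are our assumptions, not part of the AI Act or of CapAI [129]), the seven requirements listed above can be kept as a machine-readable checklist against which conformity evidence is registered:

```python
# Illustrative only: the article numbers and names follow the list above,
# but the checklist structure and evidence fields are assumptions.
HRAI_REQUIREMENTS = {
    "Art. 9":  "Risk management system",
    "Art. 10": "Data and data governance",
    "Art. 11": "Technical documentation",
    "Art. 12": "Record-keeping",
    "Art. 13": "Transparency",
    "Art. 14": "Human oversight",
    "Art. 15": "Accuracy, robustness and cybersecurity",
}

def conformity_gaps(evidence):
    """Return the requirements for which no evidence has been registered yet."""
    return {art: name for art, name in HRAI_REQUIREMENTS.items() if not evidence.get(art)}

# Hypothetical evidence collected so far for a system under assessment.
evidence = {"Art. 9": ["risk_register.xlsx"], "Art. 14": ["oversight_protocol.pdf"]}
print(conformity_gaps(evidence))
```

Under this reading, gaps surfaced by such a check would have to be closed before the conformity assessment that high-risk systems must undergo prior to entering the EU market.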
This is why\\nin order to realize trustworthy AI that is compliant with\\nthe law, we advocate for the development of responsible\\nAI systems, i.e., systems that not only make a responsible\\nimplementation that fulfills the requirements for trustworthy\\nAI, but also comply with the AI regulation.\\nIn practice, HRAIs providers must work to make their\\nassets meet these requirements, including post-market mon-\\nitoring plans [122] (AI Act Art. 61 - Post-market monitoring\\nby providers and post-market monitoring plan for high-risk\\nAI systems) to document the performance throughout the\\nsystem’s lifetime, in a way that vague concepts become veri-\\nfiable criteria that strengthen the assessment safeguards and\\ninternal checks. Likewise, conformity assessments (AI Act,\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 19 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nArt. 19 and Art. 43) will be ruled by internationally harmo-\\nnized testing principles, guaranteeing high-quality testing.\\nThese tests can depart from ad-hoc procedures and protocols\\nfor the domain at hand. This is the case of the German stan-\\ndardization roadmap on AI [11], which proposes conformity\\nassessments based on several steps: calibration, inspection,\\naudit, validation and verification.\\nThis need for harmonized testing protocols, monitoring\\nplans and conformity assessment procedures is the main\\nreason for the emergence of the concept of AI regulatory\\nsandboxes, which are next detailed and discussed.\\n6.3. Artificial Intelligence regulatory sandboxes: A\\nchallenge for auditing algorithms\\nOnce requirements for HRAIs have been established, the\\nremaining challenge is to make the AI system comply with\\nthem appropriately. Such requisites (AI Act, Chapter 2, Art.\\n8-15) motivate the need for a test environment where to\\naudit AI-based systems by safe and harmonized procedures\\nestablished by the latter. Regulatory sandboxes are indeed\\nrecommended by the AI Act (Chapter 5, Art. 53-54). Con-\\ncretely, the AI Act establishes that algorithms should comply\\nwith regulation and can be tested in a safe environment\\nprior to entering the market. This auditing process can be\\nimplemented via regulatory sandboxes.\\nIn order to successfully undertake AI auditing processes\\nunder the new regulation, industry, academia and govern-\\nmental actors are forced to adapt their processes and teams\\nto comply with the law. Regulatory sandboxes act as test beds\\nand safe playgrounds that allow assessing the compliance of\\nAI systems with respect to regulation, risk mitigation strate-\\ngies, conformity assessments, accountability and auditing\\nprocesses established by the law. Figure 4 illustrates the two\\nstages where sandboxes play a crucial role: i) pre-market\\nauditability and conformity check, and ii) post-market mon-\\nitoring and accountability. The figure illustrates not only the\\ndifferent stakeholders participating in these two stages, but\\nalso the articles in the AI Act where each step within the\\nprocess is described.\\nIn the current context of rapidly evolving AI products,\\nsandboxes allow market stakeholders and business players to\\nexplore and experiment with new and innovative products,\\nservices or businesses under the supervision of a regulator.\\nHowever, the idea of resorting to a sandbox to explore,\\nevaluate and gauge complex technology is not new, nor ex-\\nclusive of AI systems. 
They have already been used in other\\ncontexts to test and validate Fintech [130] or Blockchain16\\ntechnologies in the European Union. The objective of these\\ncontrolled environments is to test innovative technologies for\\na limited time, for innovators and regulators to cooperate17.\\nThe AI Act also contains measures with the aim to reduce\\nthe regulatory burden on Small and Medium Enterprises\\n16Launch of the European Blockchain Regulatory Sandbox. https:\\n//digital-strategy.ec.europa.eu/en/news/launch-european-blockchai\\nn-regulatory-sandbox, accessed on April 25th, 2023.\\n17First regulatory sandbox on Artificial Intelligence presented: https:\\n//digital-strategy.ec.europa.eu/en/news/first-regulatory-sandbox-art\\nificial-intelligence-presented\\n(SMEs) and startups, prioritize them, and to reduce their\\ntime to market by ensuring legislation can be implemented\\nin two years. The intended goal is to support innovation\\nand small-scale providers, getting apart from the regulation\\nstifling innovation critique.\\nThe benefits of sandboxes is that they support the de-\\nvelopment, testing and validation of innovative AI systems\\nunder the direct supervision and guidance of competent\\nauthorities (AI Act Art. 53). Furthermore, they allow ex-\\nperimenting by derogation (by putting aside certain rules\\nor laws), and experimentation by devolution, which requires\\nbroad supra/national frameworks to establish guidelines that\\nempower and help local governments to establish a regula-\\ntion in a particular area. This enables differences among gov-\\nernment levels by considering local preferences and needs as\\na means to stimulate innovative policies.\\nWhen it comes to the challenges faced by sandboxes,\\nthere is a concern for the lack of proper methodological\\nassessments that are indicative of the possible impact of\\nAI on the society [131]. This concern fosters the need\\nfor cross-border and multi-jurisdictional regulatory sandbox\\nstandardization [132], as well as generic AI standardization\\n[133]. Governments will have to find a balance between EU\\ncoordination and national procedures to avoid conflicts in the\\nimplementation of the regulation [134]. Specifically in the\\nAI Act (Art. 53), participants in the sandbox remain liable\\nunder applicable liability legislation. Eligibility criteria and\\nparticipants obligations and rights is to be set up in imple-\\nmenting acts.\\nDerived from the above challenge, we note that sand-\\nboxes are still far from maturity. This leads to two main\\naspects that remain unresolved: 1) the design of sandboxes\\nwith guidelines that rapidly and effectively permit algorith-\\nmic auditing; and 2) the development of intelligent systems\\nfor high-risk scenarios that are validated through the nec-\\nessary auditing processes. Important efforts are currently\\ndriven towards addressing these aspects as two additional\\nfundamental challenges. At European level, Spain is leading\\na pilot to set up a regulatory sandbox according to the\\nEuropean AI Act legislation.\\nTogether with sandboxes to work in practice, additional\\nfuture mechanisms will include the certification or quality\\ncontrol within a regulatory framework. 
In this sense, Spain is starting to develop a national seal of quality to certify the security and quality of AI technology used in Spain. In cooperation with industry, they will set up the technical criteria for companies to obtain this seal, and develop tools to facilitate the certification process, e.g., self-assessment software. The possibility of granting the seal will be opened to several companies, and obtaining it will be voluntary for AI companies. At the international level, one effort towards this end is the IEEE CertifAIEd program18 to assess the ethics of Autonomous Intelligent Systems via certification guidance, assessment and independent verification. This mark is meant for IEEE authorized assessors and certifiers to perform an independent review and verification to grant a mark and certificate based on ontological specifications for Ethical Privacy, Algorithmic Bias, Transparency, and Accountability.
18IEEE CertifAIEd: https://engagestandards.ieee.org/ieeecertifaied.html, accessed on June 6th, 2023.
Figure 4: Diagram showing the role of sandboxes before (ex-ante) and after (post-hoc) the AI-based system has been deployed in the market, contrasting a pre-market auditability & conformity stage with a post-market monitoring & accountability stage. Sandboxes permit evaluating the conformity of the AI-based system w.r.t. technical specifications, horizontal & vertical regulation, and ethical principles in a controlled and reliable testing environment. Once conformity has been verified, sandboxes can be used to interface with the deployed AI-based asset via the established monitoring plan, so that information about its post-market functioning can be collected and processed. This information is used by the national supervisory authority to evaluate compliance: if needed, the authority asks for corrective actions and/or reports serious incidents or a continued lack of compliance to the European Commission. Articles in the AI Act related to each step are cited in the diagram.
We expect that the first experiences and results of running regulatory sandboxes and their alignment with certification activities will make it possible to learn lessons, to improve AI systems and, eventually, to support the progressive proliferation of responsible AI systems deployed in practical scenarios. We believe that sandbox assessments should be periodically performed by independent and impartial assessment bodies to certify and audit AI systems during their lifetime.
6.4. Practical case study in Artificial Intelligence for healthcare
At the time of writing (April 2023), the AI Act regulation draft is constantly being updated through different amendments, due in part to new versions of AI products accessible to the general public. Given how essential the translation of principles and regulation into specific processes is, it becomes necessary to have blueprint models and protocols that serve to assess how trustworthy AI systems are.
The blueprint for Trustworthy AI Implementation Guidance and Assurance for Healthcare is one step taken in this direction. Figure 5 shows the proposal by the coalition for health AI [135], based on collaboration, guiding principles and leadership actions. It is aligned with the AI risk management framework from the National Institute of Standards and Technology (NIST).
In particular, to ensure trustworthiness this framework describes four key functions [135]:
• Mapping, i.e., framing AI risks;
• Measuring these risks and impacts quantitatively and qualitatively;
• Managing the allocation of risk resources; and, cutting across the previous three,
• Governance via risk management.
Figure 5: The Coalition for Health AI (CHAI) Blueprint for Trustworthy AI implementation guidance and assurance [135], built on collaboration, guiding principles and leadership, and organized around an integrated data infrastructure for discovery, evaluation and assurance; health system preparedness & assessment; and AI tools, trustworthiness and transparency across the lifecycle. It considers obligations on reliability and testability, transparency, mitigation of biases particular to the domain, privacy, security and resilience, among other aspects.
Based on these functions, they define values to set the key elements of trustworthy AI in healthcare [135]:
1. Useful: valid and reliable, testable, usable and beneficial. These values are closely linked to social wellbeing (Requirement 6, Subsection 5.7) and auditability (Requirement 7, Subsection 5.8).
2. 
Safe, which is related to technical robustness and safety\\n(Requirement 2, Subsection 5.3).\\n3. Accountable and transparent, with clear connections\\nto accountability (Requirement 7, Subsection 5.8) and\\ntransparency (Requirement 4, Subsection 5.5).\\n4. Explainable and interpretable, echoing several dimen-\\nsions of the transparency requirement.\\n5. Fair with harmful bias managed (systemic bias, com-\\nputational and statistical biases and human-cognitive bi-\\nases). The consideration of fairness and the focus on\\nthe management of consequences of harmful biases are\\npresent in requirement 5 (Diversity, non-discrimination\\nand fairness, Subsection 5.6), particularly in the avoid-\\nance of unfair biases. Furthermore, requirement 7 (ac-\\ncountability, Subsection 5.8) also regards the minimiza-\\ntion of negative impacts, either due to harmful biases or\\nto other consequences.\\n6. Secure and resilient, which relate to the dimension of\\ntraceability (requirement 4, Subsection 5.5) and technical\\nrobustness (Requirement 2, Subsection 5.3).\\n7. Privacy-enhanced, which is coupled with requirement 3\\n– Privacy and data governance (Subsection 5.4).\\nIt is important to underscore that some dimensions of\\nthe requirements for trustworthy AI discussed in Section\\n5 are not reflected (at least, explicitly) in the above list of\\nvalues, e.g. environmental wellbeing or reproducibility. This\\nresounds with our formulated definition of a responsible\\nAI system, showing that a system as such, depending on\\nits domain of application, may require different degrees of\\ncompliance with the requirements for trustworthiness.\\nThe blueprint analyzed in [135] recognizes the difficul-\\nties on building ecosystems when multiple guidelines are left\\nout in the wild without a standardization consensus. It calls\\nfor mapping socio-technical scenarios to resolve tensions\\namong principles, an ISO-based approach to professional\\nresponsibility, and institutionalizing trustworthy AI Systems\\n(that is, responsible AI systems).\\nAs a follow-up of the blueprint, the coalition for health\\nAI [135] suggests:\\n• Setting up an engaged assurance lab and advisory service\\nindependent infrastructure.\\n• Institutionalizing trustworthy AI systems (responsible AI\\nsystems).\\n• Promoting a Coalition of the Willing through interesting\\nstrategies that can be applied in health AI to drive a\\npositive change.\\nOther practical frameworks exist. They count with strate-\\ngies to implement ethics and the governance of AI systems in\\nhealth to separate the factors affecting trustworthy medical\\nAI into design (data and algorithm aspects) and application.\\nThis is done through controls strategies [136] at both design\\nand application phases. First, the ethical governance system\\ndeparts from social needs and ethical values, which lead\\nto ethical principles to be enforced at the research stage.\\nAfter that, those principles guide the ethical norms that allow\\nperforming risk assessment, and later make the law and\\nregulation concrete. In particular, the framework in [137]\\naims at avoiding situations that can have dire consequences\\nfor patients. For instance, integrating the applied ethics Z-\\nInspection® [138] process to map and assess tensions in\\nsocio-technical scenarios in trustworthy AI. Another pro-\\nposal [139] formulates ten commandments (i.e., high-level\\nethical principles) that should be met by medical AI-based\\nsystems. 
Such commandments are formulated without the\\ntheoretical aspects underneath for the sake of an easier\\nunderstanding and verification of all stakeholders involved\\nin this domain. European fundamental rights also provide\\nlegal and ethical guidelines for the adoption, development\\nand application of medical AI [140]. These strategies and the\\nblueprint are advancing the research in the area, and results\\nwill be extensible to other domains to attain trustworthy AI.\\nDespite the clear establishment of the seven require-\\nments for HRAIs within the AI Act described in Section\\n6.2, the particular implementation steps to be taken within a\\nparticular area of application often remain under-specified.\\nIt becomes evident that the AI-based system is stringently\\ndependent on the sector of application, as well as on the\\ncoupling of the requirements for trustworthy AI to justify\\nexisting regulations and standards. Therefore, for a given\\ndomain of practice, an overarching consideration of the\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 22 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\ncomplete scenario is needed from the Trustworthy AI prac-\\ntical point of view. The field needs to further specify legal\\nrequirements, risk assessment tools for the ethical impact\\nof the AI system, data privacy and data governance mod-\\nels, ad-hoc risk management systems and conformity as-\\nsessments, and rest of essential elements evaluated in the\\nregulatory sandboxes testing the scenario. This may also\\nspur the emergence of generic trustworthy AI frameworks\\n(regulatory sandboxes) that can be potentially adapted to\\ndifferent domains, as it is discussed in [141].\\n6.5. Urgent needs for emerging AI systems,\\ndynamic regulation, and evaluation protocols\\nThe widespread use and repercussion of the achieve-\\nments of emerging AI systems, such as GPAIS or neuro-\\nscience technology, have brought to the public arena the\\npotentials and implications of new high-risk scenarios sup-\\nported by these technological advances. In this section we\\ndiscuss potential issues to be tackled to regulate new HRAIs\\nas well as future emerging AI systems. We discuss and argue\\nthat regulation should be dynamic and malleable to establish\\nthe boundaries of new high-risk scenarios supported by\\ntechnological AI advances. Likewise, we also highlight the\\nneed for flexible evaluation procedures that can be adapted\\nin an agile way to cope with the fast evolution of AI systems.\\nIndeed, the rapid pace at which AI evolves over time can\\nunexpectedly give rise to new high-risk scenarios beyond\\nthose defined by regulation, such as the AI Act (Section\\n6.2). This requires regulatory protocols to cope with new\\nemerging applications. In the case of the European AI Act,\\non 11th May 2023, MEPs endorsed new transparency and\\nrisk-management rules for AI systems19. MEPs expanded\\nthe classification of high-risk areas to include those that\\ncould compromise or harm people’s health, safety, funda-\\nmental rights or the environment. Such revised classification\\nof high-risk scenarios also considered AI systems used to\\ninfluence voters in political campaigns, as well as recom-\\nmender systems (with more than 45 million users) utilized\\nby social media platforms. 
Intrusive and discriminatory uses of AI-based biometric systems have also been identified as prohibited AI systems, such as:
• “Real-time” remote biometric identification systems in publicly accessible spaces;
• “Post” remote biometric identification systems, with the only exception of law enforcement for the prosecution of serious crimes and only after judicial authorization;
• Biometric categorization systems using sensitive characteristics (e.g. gender, race, ethnicity, citizenship status, religion, political orientation);
• Predictive policing systems (based on profiling, location or past criminal behavior);
• Emotion recognition systems in law enforcement, border management, workplace, and educational institutions; and
• Indiscriminate scraping of biometric data from social media or CCTV footage to create facial recognition databases (violating human rights and the right to privacy).
19AI Act: a step closer to the first rules on Artificial Intelligence, https://www.europarl.europa.eu/news/en/press-room/20230505IPR84904/ai-act-a-step-closer-to-the-first-rules-on-artificial-intelligence, accessed on June 6th, 2023.
In this revision of the AI Act, MEPs have also proposed tailored regulatory regimes for new and fast-evolving developments in the field of AI and GPAIS. Since GPAIS are systems that have a wide range of possible uses without substantial modification and fine-tuning, generative foundation models are examples of rapidly evolving areas for which, if regulation is not set in place, consequences may be hard to revert. Such systems must guarantee robust protection of fundamental rights, health and safety, the environment, democracy and the rule of law. To this end, such emerging AI systems must assess and mitigate risks, comply with design, information and environmental requirements, and be registered in the EU database. Furthermore, additional transparency requirements have been demanded for generative foundation models such as GPT: they must disclose that the content is generated by an AI model, the model must be designed to avoid generating illegal content, and summaries of the copyrighted data used during training must be published. Jurisdiction at the national level will also need to be adapted to different considerations demanded by different sectors, e.g., the public sector or the labor sector.
Another area in which regulation and technology have yet to advance is the copyright management of artwork generated by foundation models. Although the AI Act requires disclosing the use of copyrighted material in the training data, there is currently no way to detect when AI-generated content may be directly related to existing content protected by copyright, nor is it clear who owns the intellectual property of generative models’ outputs [142, 143].
Besides GPAIS, other emerging AI-based technologies also require specialized adjustments of ongoing regulatory efforts. This is the case of neurotechnology, such as brain interfaces. The need to handle novel applications never seen before becomes evident from recent research [144] that shows the potential of “mind-reading” [145]. For instance, the study in [144] shows the potential of leveraging language models as an autoregressive prior to generate novel sequences that can decode structured sequential information in the form of text from brain signals. 
Although the study of\\nhuman imagination decoding shows human cooperation is\\nrequired for the approach to work, this may not be a requisite\\nin the future. Even if decoding is not accurate yet, these\\nsystems could be used maliciously.\\nThese recent results attained by neurotechnology call for\\nraising awareness about the risks posed by brain decoding\\ntechnology, and for the design of regulation and policies to\\npreserve fundamental rights such as mental privacy. A role\\nmodel in this direction is the novel neurorights regulation\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 23 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\npioneered by Chile20. Their neuroprotection agenda (which\\nis closely followed up by Spain) leads the way to the reg-\\nulation of brain technology and legislation of advances in\\nAI-supported medicine and science from a human rights’\\npoint of view. This is of utmost importance to avoid mental\\nmanipulation, as mental diseases are the grand pandemic\\nof XXI century. If used appropriately, AI based technology\\napplied to the diagnosis and treatment of mental diseases has\\na great potential to improve patients’ quality of life.\\nIn order for regulation to evolve synchronously with\\ntechnological advances (and vice versa), evaluation proce-\\ndures must be flexible and dynamically adaptable to accom-\\nmodate the implications of these advances over time. More\\ncomprehensive evaluation tools are required to be adopted\\nby the AI community of practitioners and users if we aspire\\nto synergistic solutions that can complement governmental\\nefforts. In particular, ML and AI model evaluation is nor-\\nmally based on leader-board benchmarks that do not always\\nreflect reality, and may have a detrimental effect when they\\nare not faithful to reality. A more accountable evaluation\\nshould consider aggregated metrics. Improperly constructed\\nbenchmarks may, for instance, reflect unrealistic overesti-\\nmation of the capabilities of a model when predicting over\\nminority classes. This may lead to hazards that end up\\nobfuscating the real benefits of AI.\\nTo avoid these issues, important guidelines for robust\\nevaluation practices [146] include:\\n1. Granular performance reporting protocols with break-\\ndowns across the features that have demonstrated affect-\\ning performance.\\n2. Designing benchmarks to test capabilities and to signif-\\nicantly vary on important features of the problem space\\nand labeling instances to allow for granular analyses (e.g.\\nas the Holistic Evaluation of Language Models bench-\\nmark [147]).\\n3. Record all results, successful or failing, partial or fully\\ntraced, in supplementary material or public repositories\\nfor each run and validation split separately (e.g., in med-\\nical AI [148]).\\n4. Enable researchers follow-up instance-level analyses by\\nincluding data labels and annotations of those instances.\\nHowever, prospective evaluation methodologies as the\\none described above should be versatile and extensible to\\nembrace and incorporate new performance metrics, evalu-\\nation protocols or even modeling tasks proposed along the\\nyears. 
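To make guideline 1 above concrete, the following minimal sketch (ours, with entirely hypothetical data and slice names) reports accuracy both in aggregate and broken down across a feature known to affect performance, keeping per-instance results available for the follow-up analyses of guidelines 3 and 4:

```python
# Minimal sketch of granular performance reporting with hypothetical data.
from collections import defaultdict

# (instance_id, feature_slice, true_label, predicted_label) -- illustrative only
results = [
    ("x1", "minority_class", 1, 0),
    ("x2", "minority_class", 1, 1),
    ("x3", "majority_class", 0, 0),
    ("x4", "majority_class", 0, 0),
    ("x5", "majority_class", 1, 1),
]

per_slice = defaultdict(list)
for instance_id, slice_name, y_true, y_pred in results:
    per_slice[slice_name].append(int(y_true == y_pred))

report = {name: sum(hits) / len(hits) for name, hits in per_slice.items()}
print("Aggregate accuracy:", sum(r[2] == r[3] for r in results) / len(results))
print("Per-slice accuracy:", report)   # reveals gaps the aggregate number hides
```

In this toy example the aggregate accuracy of 0.8 hides a much lower accuracy on the minority slice, which is exactly the kind of overestimation warned about above.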
We conclude that, given the fast pace at which AI has been progressing in recent months, it is of paramount importance to have a dynamic regulation from a double perspective: the appearance of risk-based scenarios and the emergence of novel AI systems. Only in this way will regulation facilitate the realization of responsible AI systems, in parallel to the development of methodologies for algorithmic auditing and the clearance of responsibilities in the use of such systems.\n20NeuroRights Foundation, https://neurorightsfoundation.org/, accessed on June 6th, 2023.\n7. From the Artificial Intelligence moratorium letter to regulation as the key for consensus\nAt the time of writing, a global debate is being held around the moratorium letter published by several renowned researchers calling for a pause in large-scale AI experimentation21. The letter can be interpreted as a contribution to pointing out the gap between the fast advance of high-powered AI systems and their regulation. The letter also highlights that:\n“AI research and development should be refocused on making today’s powerful, state-of-the-art systems more accurate, safe, interpretable, transparent, robust, aligned, trustworthy, and loyal.”\nFollowing up on this moratorium letter, several declarations and written statements by reputed experts have been published to approach the AI conundrum between ethics, regulation and technological progress from different perspectives. Among them, we highlight the interview with G. Hinton22, in which he states that “We need to find a way to control artificial intelligence before it’s too late”. Interestingly, within the scope of this work, he has also underscored the different nature of intelligent systems when compared to human intelligence, and thereby the need to establish regulation for these artificial systems:\n“Our brains are the result of evolution and have a series of integrated goals — such as not hurting the body, hence the notion of damage; eating enough, hence the notion of hunger. Making as many copies of ourselves as possible, hence the sexual desire. Synthetic intelligence, on the other hand, hasn’t evolved: we’ve built it. Therefore, it doesn’t necessarily come with innate goals. So, the big question is, can we make sure that AI has goals that benefit us? This is the so-called alignment problem.
And we have several reasons to be very concerned.”\nA similar line of thinking has been expressed by Harari23, emphasizing the urgency with which regulation is needed to match the speed of AI technological development with the public use of AI systems:\n21Future of Life Institute, Pause giant AI experiments: An open letter, https://futureoflife.org/open-letter/pause-giant-ai-experiments/, accessed on April 25th, 2023\n22Geoffrey Hinton: “We need to find a way to control artificial intelligence before it’s too late”, https://english.elpais.com/science-tech/2023-05-12/geoffrey-hinton-we-need-to-find-a-way-to-control-artificial-intelligence-before-its-too-late.html, accessed on June 4th, 2023\n23Yuval Noah Harari argues that AI has hacked the operating system of human civilisation, https://www.economist.com/by-invitation/2023/04/28/yuval-noah-harari-argues-that-ai-has-hacked-the-operating-system-of-human-civilisation, accessed on June 4th, 2023.\n“We can still regulate the new AI tools, but we must act quickly. Whereas nukes cannot invent more powerful nukes, AI can make exponentially more powerful AI. The first crucial step is to demand rigorous safety checks before powerful AI tools are released into the public domain. Just as a pharmaceutical company cannot release new drugs before testing both their short-term and long-term side-effects, so tech companies shouldn’t release new AI tools before they are made safe. We need an equivalent of the Food and Drug Administration for new technology, and we need it yesterday.”\nAnother example is the proposal made by Sam Altman (OpenAI co-founder) before the US Senate to defend the benefits of this revolutionary technology, claiming that AI regulation should ensure that the public has access to its many advantages24:\n“[...] we are not alone in developing this technology. It will be important for policymakers to consider how to implement licensing regulations on a global scale and ensure international cooperation on AI safety, including examining potential intergovernmental oversight mechanisms and standard-setting.”\nFinally, the manifesto on AI risk supported by multiple scientists and notable figures in the AI landscape has called for elevating the mitigation of AI risks to the priority level of other humanity-threatening risks25:\n“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”\nThe controversy around these and other statements is whether humanity is close to or far from the moment at which AI can pose a realistic threat to its own existence. Unquestionably, triggering this debate has further ignited the need for ethical and regulatory frameworks that regulate whether and how AI-based systems can be trusted and used in practical setups.\nAfter this latter manifesto, the Center for AI Safety has just published a new document entitled ‘Existing policy proposals aimed at present and future harms’26. The aim of this one-page document is to describe three proposals that, in our view, promote AI safety.
We follow with a short analysis of these proposals:\n24Written Testimony of Sam Altman, Chief Executive Officer, OpenAI, https://www.washingtonpost.com/documents/0668f6f4-d957-4b94-a745-2aa9617d1d60.pdf?itid=lk_inline_manual_18, accessed on June 4th, 2023.\n25Statement on AI Risk, https://www.safe.ai/statement-on-ai-risk, accessed on June 4th, 2023.\n26Existing Policy Proposals Targeting Present and Future Harms, https://www.safe.ai/post/three-policy-proposals-for-ai-safety, accessed on June 7th, 2023.\n• Legal Liability for AI harms: The first issue highlighted in the document is the need to establish improved legal liability frameworks for the accountability of damages caused by the application of AI systems. GPAIS are also referred to in the document, warning against the ill-advised implementation of legal exemptions to absolve GPAIS developers of liability, as such exemptions could unfairly shift the entire burden of responsibility from large corporations to smaller actors, users and communities lacking the necessary resources, access, and capabilities to effectively address and alleviate all risks.\n• Increased regulatory scrutiny: The second problem emphasized in this document is the need for greater regulatory inspection during the development of AI systems, extending beyond the application layer to encompass the entire product lifecycle. It underscores the importance of holding companies responsible for the data and design choices they make when developing these models. In line with this proposal, increased transparency and regulations over training data are crucial to address algorithmic bias effectively, and to prevent companies from unfairly leveraging copyrighted materials through data modeling without compensating their creators.\n• Human supervision of automated systems: The third theme in the document is the importance of human oversight in the implementation of HRAIs. Human oversight can contribute to lessening potential concerns with bias and the propagation of false or misleading information through AI systems. An explicit reference is made to the EU’s regulatory proposal, with a positive emphasis on the importance it grants to human oversight in the deployment of HRAIs.\nOur position, as we put it in this manuscript, is that “regulation is a key for consensus” among these diverging voices to cast light over the shadows of modern AI technologies. For this to occur, technologies, methodologies and tools supporting the development, auditability and accountability of responsible AI systems are of utmost importance to cope with high-risk scenarios and to meet regulatory constraints.\nTo finish this section, we pay attention to a final point made by the authors of [15]. Unfortunately, this road towards consensus is not exempt from its own risks. Indeed, conflating trust and trustworthiness with the acceptability of risks blurs the distinction between acceptability judgments made by domain experts and the trustworthiness of AI systems implemented in society [15]. It has been argued that trust is improbable to produce on demand and impossible to produce on command, as “trust engineering” may backfire and not achieve its goal.
Focused on trust and trustworthiness in AI in the public sector, [15] discusses four acute challenges facing the European Commission’s attempt to signal the trustworthiness of AI through its proposed regulation: the uncertainty about the antecedents of perceived trust in public institutions that utilize AI; the threat of misalignment between trustworthiness and degrees of trust; concealed behavioral factors behind the acceptability of risks; and the need for impartial intermediaries.\nDespite these and other curves in the road, regulation can be an unquestionable driving force to consolidate and put all these diverging voices on the same page. Regulation has favored consensus about the benefits and restrictions of technological advances that have evolved faster than expected, permeating quickly into society (e.g., social networks, the Internet or mobile communications). AI should not be an exception. There is still a long way to go before we have fully aligned AI technology and regulation, developing responsible AI systems adapted to each risk scenario and fully leveraging the latest advances in the field. For this to occur, the European regulatory model based on risk-based use case scenarios can serve as a guiding light for the maturity and implementation of ethical, legal and technical frameworks, fostering the creation of industrial and institutional instruments (e.g. AI sandboxes or AI ethics boards [149]) that guarantee that AI-based products and services comply with their requirements.\n8. Concluding remarks\nFor years now, the ever-growing capabilities of AI-powered systems have stimulated debates about the impact, benefits, implications and risks brought by AI systems to industry and society. The ground-breaking potential of large generative AI models such as ChatGPT and GPT-4 has reinvigorated this debate, since their near general-purpose capabilities, learned from multimodal data, can support a wide variety of intended and unintended purposes and tasks by generating content that is hardly distinguishable from that made by humans. This notable advance has renewed the relevance and momentum of trustworthy AI systems, particularly with regard to 1) the ethical usage of these models, and 2) the need for regulatory directives that establish what, when and how AI systems can be adopted in practical applications.\nIn this context, this manuscript has shed light on the principles, pillars and requirements to be met by trustworthy AI systems to be considered as such. To this end, we have started from mature regulatory/supervisory frameworks developed around trustworthy AI (e.g. the AI Act) to provide clear definitions of all related concepts, placing emphasis on what each requirement for trustworthiness in AI stands for, why it contributes to generating trust in the user of an AI-based system, and how such requirements can be met technically. Regarding the latter, a short tour of the technological areas that can contribute to each of these requirements has been offered. Our study has also overviewed ethical principles for the development of AI, which establish an overarching set of recommendations that ensure that this discipline will be advanced under social and ethical standards.
The study has been complemented by a discussion on the practical aspects to be considered in the design, development and use of trustworthy AI systems, stressing the importance of assessing their conformity to regulations (auditability) and explaining how their decisions are issued (accountability). These two practical aspects must be met by responsible AI systems. Further along this line, accountability and explainability have permeated deeply into the recommendations recently issued for the development of trustworthy medical AI, a risk-critical sector in large demand for trust when embracing new technological advances. Our analysis of such recommendations has shown that auditability and accountability are at the core of the guidelines proposed in this area, together with ethics, data governance and transparency. Medical AI exemplifies the paramount relevance of considering all these requirements for trustworthiness along the entire AI cycle.\nFor a given domain of practice, we need to assess the complete scenario from the Trustworthy AI practical point of view, that is, all essential elements audited in regulatory sandboxes for scenario testing, together with clear accountability protocols. Above all, the development of responsible AI systems as the final output of the chain is essential and must be the goal for current AI designs and developments.\nIn summary, we hope that this paper serves as a reference for researchers, practitioners and neophytes who are new to the world of AI, with an interest in trustworthy AI from a holistic perspective. A well-rounded analysis of what trust means in AI-based systems and of its requirements, as the one offered in this manuscript, is key for the design and development of responsible AI systems throughout their life cycle. We should not regulate scientific progress, but rather products and their usage. As we emphasize in this paper, regulation is the key for consensus, and for this purpose, trustworthy AI and responsible AI systems for high-risk scenarios are imperative, as they will contribute to the convergence between technology and regulation, the advance of science, the prosperity of our economies, and the good of humanity, subject to legal requirements and ethical principles.\n9. Acknowledgments\nN. Díaz-Rodríguez is currently supported by a Marie Skłodowska-Curie Actions (MSCA) Postdoctoral Fellowship with agreement ID 101059332 and the Leonardo Scholarship for Researchers and Cultural Creators 2022 from the BBVA Foundation. J. Del Ser has received funding support from the Spanish Centro para el Desarrollo Tecnológico Industrial (CDTI) through the AI4ES project, and from the Basque Government (Eusko Jaurlaritza) through the Consolidated Research Group MATHMODE (IT1456-22). F. Herrera has received funding support from the Spanish Ministry of Science and Innovation (grant PID2020-119478GB-I00).\nDeclaration of competing interest\nThe authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.\nReferences\n[1] A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford, M. Chen, I.
Sutskever, Zero-shot text-to-image generation, in: Inter-\\nnational Conference on Machine Learning, PMLR, 2021, pp. 8821–\\n8831.\\n[2] C. Saharia, W. Chan, S. Saxena, L. Li, J. Whang, E. L. Denton,\\nK. Ghasemipour, R. Gontijo Lopes, B. Karagol Ayan, T. Salimans,\\nJ. Ho, D. J. Fleet, M. Norouzi, Photorealistic text-to-image diffusion\\nmodels with deep language understanding, in: S. Koyejo, S. Mo-\\nhamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh (Eds.), Advances in\\nNeural Information Processing Systems, Vol. 35, Curran Associates,\\nInc., 2022, pp. 36479–36494.\\nURL https://proceedings.neurips.cc/paper_files/paper/2022/file\\n/ec795aeadae0b7d230fa35cbaf04c041-Paper-Conference.pdf\\n[3] European Commission High-Level Expert Group on AI, Ethics\\nguidelines for trustworthy AI (2019).\\n[4] European Union, Proposal for a Regulation of the European Parlia-\\nment and of the Council Laying down harmonised rules on Artificial\\nIntelligence (Artificial Intelligence Act) and amending certain Union\\nLegislative Acts. COM/2021/206 final (2021).\\n[5] UNESCO, Recommendation on the ethics of artificial intelligence,\\nDigital Library UNESDOC (2020).\\nURL en.unesco.org\\n[6] R. Benjamins, A. Barbado, D. Sierra, Responsible AI by design in\\npractice, in: Proceedings of the Human-Centered AI: Trustworthi-\\nness of AI Models & Data (HAI) track at AAAI Fall Symposium,\\n2019.\\n[7] G. Pisoni, N. Díaz-Rodríguez, H. Gijlers, L. Tonolli, Human-\\ncentered artificial intelligence for designing accessible cultural her-\\nitage, Applied Sciences 11 (2) (2021) 870.\\n[8] B. C. Stahl, D. Wright, Ethics and privacy in AI and big data:\\nImplementing responsible research and innovation, IEEE Security\\n& Privacy 16 (3) (2018) 26–33.\\n[9] M. Coeckelbergh, AI ethics, MIT Press, 2020.\\n[10] M. Coeckelbergh, Artificial intelligence, responsibility attribution,\\nand a relational justification of explainability, Science and engineer-\\ning ethics 26 (4) (2020) 2051–2068.\\n[11] W. Wahlster, C. Winterhalter, German standardization roadmap on\\nartificial intelligence, DIN/DKE, Berlin/Frankfurt (2020) 100.\\n[12] L. Edwards, The EU AI Act: a summary of its significance and scope,\\nAda Lovelace Institute, Expert explainer Report (2022) 26.\\n[13] S. Campos, R. Laurent, A Definition of General-Purpose AI Sys-\\ntems: Mitigating Risks from the Most Generally Capable Models,\\nAvailable at SSRN 4423706 (2023).\\n[14] M. Estévez Almenzar, D. Fernández Llorca, E. Gómez, F. Mar-\\ntinez Plumed, Glossary of human-centric artificial intelligence,\\nTech. Rep. JRC129614, Joint Research Centre (2022).\\n[15] J. Laux, S. Wachter, B. Mittelstadt, Trustworthy artificial intelli-\\ngence and the European Union AI act: On the conflation of trustwor-\\nthiness and acceptability of risk, Regulation & Governance n/a (n/a).\\narXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/rego.125\\n12, doi:https://doi.org/10.1111/rego.12512.\\nURL https://onlinelibrary.wiley.com/doi/abs/10.1111/rego.12512\\n[16] E. Tjoa, C. Guan, A survey on explainable artificial intelligence\\n(XAI): Toward medical XAI, IEEE Transactions on Neural Net-\\nworks and Learning Systems 32 (11) (2020) 4793–4813.\\n[17] D. Doran, S. Schulz, T. R. Besold, What does explainable AI really\\nmean? A new conceptualization of perspectives, arXiv preprint\\narXiv:1710.00794 (2017).\\n[18] Z. C. 
Lipton, The mythos of model interpretability: In machine learn-\\ning, the concept of interpretability is both important and slippery,\\nQueue 16 (3) (2018) 31–57.\\n[19] European Commission High-Level Expert Group on AI, The Assess-\\nment List for Trustworthy Artificial Intelligence (ALTAI) for self\\nassessment (2020).\\n[20] C. Widmer, M. K. Sarker, S. Nadella, J. Fiechter, I. Juvina,\\nB. Minnery, P. Hitzler, J. Schwartz, M. Raymer, Towards Human-\\nCompatible XAI: Explaining Data Differentials with Concept Induc-\\ntion over Background Knowledge, arXiv preprint arXiv:2209.13710\\n(2022).\\n[21] B. Lepri, N. Oliver, A. Pentland, Ethical machines: the human-\\ncentric use of artificial intelligence, Iscience (2021) 102249.\\n[22] G. Pisoni, N. Díaz-Rodríguez, Responsible and human centric AI-\\nbased insurance advisors, Information Processing & Management\\n60 (3) (2023) 103273.\\n[23] N. Tomašev, J. Cornebise, F. Hutter, S. Mohamed, A. Picciariello,\\nB. Connelly, D. C. Belgrave, D. Ezer, F. C. v. d. Haert, F. Mugisha,\\net al., Ai for social good: unlocking the opportunity for positive\\nimpact, Nature Communications 11 (1) (2020) 2468.\\n[24] A. Holzinger, Interactive machine learning for health informatics:\\nwhen do we need the human-in-the-loop?, Brain Informatics 3 (2)\\n(2016) 119–131.\\n[25] World Economic Forum, Empowering AI leadership an oversight\\ntoolkit for boards of directors, Tech. rep. (2019).\\n[26] World Economic Forum, Empowering AI Leadership: AI C-Suite\\nToolkit , Tech. rep. (2022).\\n[27] E. Cambria, L. Malandri, F. Mercorio, M. Mezzanzanica, N. Nobani,\\nA survey on XAI and natural language explanations, Information\\nProcessing & Management 60 (1) (2023) 103111.\\n[28] L. Floridi, Establishing the rules for building trustworthy AI, Nature\\nMachine Intelligence 1 (6) (2019) 261–262.\\n[29] R. Mariani, F. Rossi, R. Cucchiara, M. Pavone, B. Simkin, A. Koene,\\nJ. Papenbrock, Trustworthy AI – Part 1, Computer 56 (2) (2023) 14–\\n18.\\n[30] P.-Y. Chen, P. Das, AI Maintenance: A Robustness Perspective,\\nComputer 56 (2) (2023) 48–56.\\n[31] K. R. Varshney, Trustworthy machine learning and artificial intelli-\\ngence, XRDS: Crossroads, The ACM Magazine for Students 25 (3)\\n(2019) 26–29.\\n[32] J. Yang, K. Zhou, Y. Li, Z. Liu, Generalized out-of-distribution\\ndetection: A survey, arXiv preprint arXiv:2110.11334 (2021).\\n[33] A. Ruospo, E. Sanchez, L. M. Luza, L. Dilillo, M. Traiola, A. Bosio,\\nA survey on deep learning resilience assessment methodologies,\\nComputer 56 (2) (2023) 57–66.\\n[34] S. Speakman, G. A. Tadesse, C. Cintas, W. Ogallo, T. Akumu,\\nA. Oshingbesan, Detecting systematic deviations in data and models,\\nComputer 56 (2) (2023) 82–92.\\n[35] T. Lesort, V. Lomonaco, A. Stoian, D. Maltoni, D. Filliat, N. Díaz-\\nRodríguez, Continual learning for robotics: Definition, framework,\\nlearning strategies, opportunities and challenges, Information fusion\\n58 (2020) 52–68.\\n[36] M. Abdar, F. Pourpanah, S. Hussain, D. Rezazadegan, L. Liu,\\nM. Ghavamzadeh, P. Fieguth, X. Cao, A. Khosravi, U. R. Acharya,\\net al., A review of uncertainty quantification in deep learning: Tech-\\nniques, applications and challenges, Information Fusion 76 (2021)\\n243–297.\\n[37] J. Parmar, S. Chouhan, V. Raychoudhury, S. Rathore, Open-world\\nmachine learning: applications, challenges, and opportunities, ACM\\nComputing Surveys 55 (10) (2023) 1–37.\\n[38] R. S. Zimmermann, W. Brendel, F. Tramer, N. 
Carlini, Increasing\\nconfidence in adversarial robustness evaluations, in: A. H. Oh,\\nA. Agarwal, D. Belgrave, K. Cho (Eds.), Advances in Neural Infor-\\nmation Processing Systems, 2022.\\nURL https://openreview.net/forum?id=NkK4i91VWp\\n[39] D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schul-\\nman, D. Mané, Concrete problems in AI safety, arXiv preprint\\narXiv:1606.06565 (2016).\\n[40] D. Hendrycks, N. Carlini, J. Schulman, J. Steinhardt, Unsolved\\nproblems in ml safety, arXiv preprint arXiv:2109.13916 (2021).\\n[41] S. Mohseni, H. Wang, C. Xiao, Z. Yu, Z. Wang, J. Yadawa, Tax-\\nonomy of machine learning safety: A survey and primer, ACM\\nComputing Surveys 55 (8) (2022) 1–38.\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 27 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\n[42] T. Gu, K. Liu, B. Dolan-Gavitt, S. Garg, Badnets: Evaluating back-\\ndooring attacks on deep neural networks, IEEE Access 7 (2019)\\n47230–47244.\\n[43] D. Hendrycks, C. Burns, S. Basart, A. Critch, J. Li, D. Song,\\nJ. Steinhardt, Aligning AI with shared human values, Proceedings\\nof the International Conference on Learning Representations (ICLR)\\n(2021).\\n[44] C. O’neil, Weapons of math destruction: How big data increases\\ninequality and threatens democracy, Crown, 2017.\\n[45] R. B. Parikh, S. Teeple, A. S. Navathe, Addressing bias in artificial\\nintelligence in health care, Jama 322 (24) (2019) 2377–2378.\\n[46] K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman,\\nV. Ivanov, C. Kiddon, J. Konečn`y, S. Mazzocchi, B. McMahan, et al.,\\nTowards federated learning at scale: System design, Proceedings of\\nMachine Learning and Systems 1 (2019) 374–388.\\n[47] N. Rodríguez-Barroso, G. Stipcich, D. Jiménez-López, J. A. Ruiz-\\nMillán, E. Martínez-Cámara, G. González-Seco, M. V. Luzón, M. A.\\nVeganzones, F. Herrera, Federated learning and differential privacy:\\nSoftware tools analysis, the Sherpa.ai FL framework and method-\\nological guidelines for preserving data privacy, Information Fusion\\n64 (2020) 270–292.\\n[48] C. Marcolla, V. Sucasas, M. Manzano, R. Bassoli, F. H. Fitzek,\\nN. Aaraj, Survey on fully homomorphic encryption, theory, and\\napplications, Proceedings of the IEEE 110 (10) (2022) 1572–1609.\\n[49] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov,\\nK. Talwar, L. Zhang, Deep learning with differential privacy, in:\\nProceedings of the 2016 ACM SIGSAC Conference on Computer\\nand Communications Security, 2016, pp. 308–318.\\n[50] Public Voice coalition, Universal Guidelines for Artificial Intelli-\\ngence, https://thepublicvoice.org/ai-universal-guidelines/,\\nonline [accessed April 20th, 2023] (2018).\\n[51] Information Commissioner’s Office (ICO), How to use AI and\\npersonal data appropriately and lawfully, https://ico.org.uk/med\\nia/for-organisations/documents/4022261/how-to-use-ai-and-perso\\nnal-data.pdf, online [accessed April 20th, 2023] (2022).\\n[52] E. Union, Regulation (EU) 2022/868 of the European Parliament\\nand of the Council of 30 May 2022 on European data governance\\nand amending Regulation (EU) 2018/1724 (Data Governance Act)\\n(2022).\\n[53] E. Union, Proposal for a REGULATION OF THE EUROPEAN\\nPARLIAMENT AND OF THE COUNCIL on harmonised rules on\\nfair access to and use of data (Data Act) (2022).\\n[54] A. Barredo Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot,\\nS. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. 
Ben-\\njamins, et al., Explainable Artificial Intelligence (XAI): Concepts,\\ntaxonomies, opportunities and challenges toward responsible AI,\\nInformation Fusion 58 (2020) 82–115.\\n[55] K. Haresamudram, S. Larsson, F. Heintz, Three levels of AI trans-\\nparency, Computer 56 (2) (2023) 93–100.\\n[56] B. Pérez, J. Rubio, C. Sáenz-Adán, A systematic review of prove-\\nnance systems, Knowledge and Information Systems 57 (2018) 495–\\n543.\\n[57] A. Holzinger, M. Dehmer, F. Emmert-Streib, R. Cucchiara, I. Au-\\ngenstein, J. Del Ser, W. Samek, I. Jurisica, N. Díaz-Rodríguez,\\nInformation fusion as an integrative cross-cutting enabler to achieve\\nrobust, explainable, and trustworthy medical artificial intelligence,\\nInformation Fusion 79 (2022) 263–278.\\n[58] S. Ali, T. Abuhmed, S. El-Sappagh, K. Muhammad, J. M. Alonso-\\nMoral, R. Confalonieri, R. Guidotti, J. Del Ser, N. Díaz-Rodríguez,\\nF. Herrera, Explainable Artificial Intelligence (XAI): What we know\\nand what is left to attain Trustworthy Artificial Intelligence, Infor-\\nmation Fusion (2023) 101805.\\n[59] M. T. Ribeiro, S. Singh, C. Guestrin, \\\"Why should I trust you?\\\"\\nExplaining the predictions of any classifier, in: Proceedings of\\nthe 22nd ACM SIGKDD International Conference on Knowledge\\nDiscovery and Data Mining, 2016, pp. 1135–1144.\\n[60] N. F. Rajani, B. McCann, C. Xiong, R. Socher, Explain yourself!\\nleveraging language models for commonsense reasoning, in: Pro-\\nceedings of the 57th Annual Meeting of the Association for Com-\\nputational Linguistics, Association for Computational Linguistics,\\nFlorence, Italy, 2019, pp. 4932–4942. doi:10.18653/v1/P19-1487.\\nURL https://aclanthology.org/P19-1487\\n[61] K. Abhishek, D. Kamath, Attribution-based xai methods in com-\\nputer vision: A review, arXiv preprint arXiv:2211.14736 (2022).\\n[62] R. Guidotti, A. Monreale, F. Giannotti, D. Pedreschi, S. Ruggieri,\\nF. Turini, Factual and counterfactual explanations for black box\\ndecision making, IEEE Intelligent Systems 34 (6) (2019) 14–23.\\n[63] J. van der Waa, E. Nieuwburg, A. Cremers, M. Neerincx, Evaluating\\nXAI: A comparison of rule-based and example-based explanations,\\nArtificial Intelligence 291 (2021) 103404.\\n[64] K. Kaczmarek-Majer, G. Casalino, G. Castellano, M. Dominiak,\\nO. Hryniewicz, O. Kamińska, G. Vessio, N. Díaz-Rodríguez, Ple-\\nnary: Explaining black-box models in natural language through\\nfuzzy linguistic summaries, Information Sciences 614 (2022) 374–\\n399.\\n[65] V. Bourgeais, F. Zehraoui, B. Hanczar, GraphGONet: a self-\\nexplaining neural network encapsulating the Gene Ontology graph\\nfor phenotype prediction on gene expression, Bioinformatics 38 (9)\\n(2022) 2504–2511.\\n[66] N. Díaz-Rodríguez, A. Lamas, J. Sanchez, G. Franchi, I. Donadello,\\nS. Tabik, D. Filliat, P. Cruz, R. Montes, F. Herrera, EXplainable\\nNeural-Symbolic Learning (X-NeSyL) methodology to fuse deep\\nlearning representations with expert knowledge graphs: The Mon-\\nuMAI cultural heritage use case, Information Fusion 79 (2022) 58–\\n83.\\n[67] L. Salewski, A. Koepke, H. Lensch, Z. Akata, CLEVR-X: A Visual\\nReasoning Dataset for Natural Language Explanations, in: Interna-\\ntional Workshop on Extending Explainable AI Beyond Deep Models\\nand Classifiers, Springer, 2022, pp. 69–88.\\n[68] G. Vilone, L. Longo, Notions of explainability and evaluation ap-\\nproaches for explainable artificial intelligence, Information Fusion\\n76 (2021) 89–106.\\n[69] I. Sevillano-Garcia, J. Luengo, F. 
Herrera, REVEL framework to\\nmeasure local linear explanations for black-box models: Deep learn-\\ning image classification case study, International Journal of Intelli-\\ngent Systems 2023 (2023) 8068569.\\n[70] I. Hupont, C. Fernández, Demogpairs: Quantifying the impact of\\ndemographic imbalance in deep face recognition, in: 14th IEEE\\nInternational Conference on Automatic Face & Gesture Recognition\\n(FG 2019), IEEE, 2019, pp. 1–7.\\n[71] M.-P. Fernando, F. Cèsar, N. David, H.-O. José, Missing the missing\\nvalues: The ugly duckling of fairness in machine learning, Interna-\\ntional Journal of Intelligent Systems 36 (7) (2021) 3217–3258.\\n[72] A. H. Gee, D. Garcia-Olano, J. Ghosh, D. Paydarfar, Explaining deep\\nclassification of time-series data with learned prototypes, in: CEUR\\nworkshop proceedings, Vol. 2429, NIH Public Access, 2019, p. 15.\\n[73] A. Cully, Y. Demiris, Quality and diversity optimization: A unifying\\nmodular framework, IEEE Transactions on Evolutionary Computa-\\ntion 22 (2) (2017) 245–259.\\n[74] S. Hajian, F. Bonchi, C. Castillo, Algorithmic bias: From discrim-\\nination discovery to fairness-aware data mining, in: Proceedings of\\nthe 22nd ACM SIGKDD International Conference on Knowledge\\nDiscovery and Data Mining, 2016, pp. 2125–2126.\\n[75] D. Pedreshi, S. Ruggieri, F. Turini, Discrimination-aware data min-\\ning, in: Proceedings of the 14th ACM SIGKDD International Con-\\nference on Knowledge Discovery and Data Mining, 2008, pp. 560–\\n568.\\n[76] N. Díaz-Rodríguez, G. Pisoni, Accessible cultural heritage through\\nexplainable artificial intelligence, in: Adjunct Publication of the 28th\\nACM Conference on User Modeling, Adaptation and Personaliza-\\ntion, 2020, pp. 317–324.\\n[77] B. Shneiderman, Human-centered AI, Oxford University Press,\\n2022.\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 28 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\n[78] N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, A. Galstyan, A\\nsurvey on bias and fairness in machine learning, ACM Computing\\nSurveys (CSUR) 54 (6) (2021) 1–35.\\n[79] X. Gu, Z. Tianqing, J. Li, T. Zhang, W. Ren, K.-K. R. Choo, Pri-\\nvacy, accuracy, and model fairness trade-offs in federated learning,\\nComputers & Security 122 (2022) 102907.\\n[80] M. Du, R. Tang, W. Fu, X. Hu, Towards debiasing DNN models from\\nspurious feature influence, in: Proceedings of the AAAI Conference\\non Artificial Intelligence, Vol. 36, 2022, pp. 9521–9528.\\n[81] B. H. Zhang, B. Lemoine, M. Mitchell, Mitigating unwanted biases\\nwith adversarial learning, in: Proceedings of the 2018 AAAI/ACM\\nConference on AI, Ethics, and Society, 2018, pp. 335–340.\\n[82] U. Aïvodji, H. Arai, O. Fortineau, S. Gambs, S. Hara, A. Tapp,\\nFairwashing: the risk of rationalization, in: International Conference\\non Machine Learning, PMLR, 2019, pp. 161–170.\\n[83] U. Aïvodji, H. Arai, S. Gambs, S. Hara, Characterizing the risk of\\nfairwashing, Advances in Neural Information Processing Systems 34\\n(2021) 14822–14834.\\n[84] R. Baeza-Yates, Bias on the web, Communications of the ACM\\n61 (6) (2018) 54–61.\\n[85] A. Balayn, C. Lofi, G.-J. Houben, Managing bias and unfairness in\\ndata for decision support: a survey of machine learning and data\\nengineering approaches to identify and mitigate bias and unfairness\\nwithin data management and analytics systems, The VLDB Journal\\n30 (5) (2021) 739–768.\\n[86] J. Silberg, J. 
Manyika, Notes from the AI frontier: Tackling bias in\\nAI (and in humans), McKinsey Global Institute 1 (6) (2019).\\n[87] G. Smith, I. Rustagi, Mitigating Bias in Artificial Intelligence,\\nAn Equity Fluent Leadership Playbook, Berkeley Haas Center for\\nEquity, Gender and Leadership (2020).\\nURL https://haas.berkeley.edu/wp-content/uploads/UCB_Playbook_\\nR10_V2_spreads2.pdf\\n[88] A. Gulati, M. A. Lozano, B. Lepri, N. Oliver, BIASeD: Bringing\\nIrrationality into Automated System Design, in: Proceedings of the\\nThinking Fast and Slow and Other Cognitive Theories in AI (in\\nAAAI 2022 Fall Symposium), Vol. 3332, 2022.\\n[89] H. Suresh, J. Guttag, A framework for understanding sources of harm\\nthroughout the machine learning life cycle, in: Equity and access in\\nalgorithms, mechanisms, and optimization, 2021, pp. 1–9.\\n[90] S. Barocas, M. Hardt, A. Narayanan, Fairness and Machine Learn-\\ning: Limitations and Opportunities, fairmlbook.org, 2019, http:\\n//www.fairmlbook.org.\\n[91] J. Pearl, D. Mackenzie, The Book of Why, Basic Books, 2018.\\n[92] N. Díaz-Rodríguez, R. Binkyt˙e, W. Bakkali, S. Bookseller,\\nP. Tubaro, A. Bacevičius, S. Zhioua, R. Chatila, Gender and sex bias\\nin COVID-19 epidemiological data through the lenses of causality,\\nInformation Processing & Management 60 (3) (2023) 103276. doi:\\nhttps://doi.org/10.1016/j.ipm.2023.103276.\\nURL https://www.sciencedirect.com/science/article/pii/S0306457\\n323000134\\n[93] D. Rolnick, P. L. Donti, L. H. Kaack, K. Kochanski, A. La-\\ncoste, K. Sankaran, A. S. Ross, N. Milojevic-Dupont, N. Jaques,\\nA. Waldman-Brown, et al., Tackling climate change with machine\\nlearning, ACM Computing Surveys (CSUR) 55 (2) (2022) 1–96.\\n[94] S. Salcedo-Sanz, J. Pérez-Aracil, G. Ascenso, J. Del Ser, D. Casillas-\\nPérez, C. Kadow, D. Fister, D. Barriopedro, R. García-Herrera,\\nM. Restelli, et al., Analysis, characterization, prediction and attribu-\\ntion of extreme atmospheric events with machine learning: a review,\\narXiv preprint arXiv:2207.07580 (2022).\\n[95] J. Cowls, A. Tsamados, M. Taddeo, L. Floridi, The AI gambit:\\nleveraging artificial intelligence to combat climate change – opportu-\\nnities, challenges, and recommendations, AI & Society (2021) 1–25.\\n[96] K. Hao, Training a single AI model can emit as much carbon as five\\ncars in their lifetimes, MIT technology Review 75 (2019) 103.\\n[97] E. Strubell, A. Ganesh, A. McCallum, Energy and policy consid-\\nerations for deep learning in NLP, in: Proceedings of the 57th\\nAnnual Meeting of the Association for Computational Linguistics,\\nAssociation for Computational Linguistics, Florence, Italy, 2019, pp.\\n3645–3650. doi:10.18653/v1/P19-1355.\\nURL https://aclanthology.org/P19-1355\\n[98] D. Patterson, J. Gonzalez, U. Hölzle, Q. Le, C. Liang, L.-M.\\nMunguia, D. Rothchild, D. R. So, M. Texier, J. Dean, The carbon\\nfootprint of machine learning training will plateau, then shrink,\\nComputer 55 (7) (2022) 18–28.\\n[99] C.-J. Wu, R. Raghavendra, U. Gupta, B. Acun, N. Ardalani,\\nK. Maeng, G. Chang, F. Aga, J. Huang, C. Bai, et al., Sustainable\\nAI: Environmental implications, challenges and opportunities, Pro-\\nceedings of Machine Learning and Systems 4 (2022) 795–813.\\n[100] R. Schwartz, J. Dodge, N. A. Smith, O. Etzioni, Green AI, Commu-\\nnications of the ACM 63 (12) (2020) 54–63.\\n[101] A. Lacoste, A. Luccioni, V. Schmidt, T. Dandres, Quantify-\\ning the carbon emissions of machine learning, arXiv preprint\\narXiv:1910.09700 (2019).\\n[102] J. Maillo, I. Triguero, F. 
Herrera, Redundancy and complexity\\nmetrics for big data classification: Towards smart data, IEEE Access\\n8 (2020) 87918–87928.\\n[103] G. C. Marinó, A. Petrini, D. Malchiodi, M. Frasca, Deep neural\\nnetworks compression: A comparative survey and choice recom-\\nmendations, Neurocomputing 520 (2023) 152–170.\\n[104] R. Mishra, H. P. Gupta, T. Dutta, A survey on deep neural network\\ncompression: Challenges, overview, and solutions, arXiv preprint\\narXiv:2010.03954 (2020).\\n[105] D. Becking, M. Dreyer, W. Samek, K. Müller, S. Lapuschkin, ECQ:\\nExplainability-Driven Quantization for Low-Bit and Sparse DNNs,\\nin: International Workshop on Extending Explainable AI Beyond\\nDeep Models and Classifiers, Springer, 2022, pp. 271–296.\\n[106] G. Hinton, O. Vinyals, J. Dean, Distilling the knowledge in a neural\\nnetwork, arXiv preprint arXiv:1503.02531 (2015).\\n[107] R. Traoré, H. Caselles-Dupré, T. Lesort, T. Sun, N. Díaz-Rodríguez,\\nD. Filliat, Continual reinforcement learning deployed in real-life\\nusing policy distillation and Sim2Real transfer, in: ICML Workshop\\non Multi-Task and Lifelong Reinforcement Learning, 2019.\\n[108] Y. Cheng, D. Wang, P. Zhou, T. Zhang, A survey of model com-\\npression and acceleration for deep neural networks, arXiv preprint\\narXiv:1710.09282 (2017).\\n[109] R. V. Zicari, J. Amann, F. Bruneault, M. Coffee, B. Düdder, E. Hick-\\nman, A. Gallucci, T. K. Gilbert, T. Hagendorff, I. van Halem,\\net al., How to assess trustworthy AI in practice, arXiv preprint\\narXiv:2206.09887 (2022).\\n[110] ISO/IEC, ISO/IEC TR 24029-1, Information technology — Arti-\\nficial Intelligence (AI) – Assessment of the robustness of neural\\nnetworks - Part 1: Overview, https://www.iso.org/standard/776\\n09.html (2021).\\n[111] D. V. Carvalho, E. M. Pereira, J. S. Cardoso, Machine learning\\ninterpretability: A survey on methods and metrics, Electronics 8 (8)\\n(2019) 832.\\n[112] J. H.-w. Hsiao, H. H. T. Ngai, L. Qiu, Y. Yang, C. C. Cao, Roadmap\\nof designing cognitive metrics for explainable artificial intelligence\\n(XAI), arXiv preprint arXiv:2108.01737 (2021).\\n[113] A. Rosenfeld, Better metrics for evaluating explainable artificial\\nintelligence, in: Proceedings of the 20th International Conference\\non Autonomous Agents and MultiAgent Systems, 2021, pp. 45–50.\\n[114] R. R. Hoffman, S. T. Mueller, G. Klein, J. Litman, Metrics\\nfor explainable AI: Challenges and prospects, arXiv preprint\\narXiv:1812.04608 (2018).\\n[115] F. Sovrano, S. Sapienza, M. Palmirani, F. Vitali, A survey on\\nmethods and metrics for the assessment of explainability under the\\nproposed AI Act, in: The Thirty-fourth Annual Conference on Legal\\nKnowledge and Information Systems (JURIX), Vol. 346, IOS Press,\\n2022, p. 235.\\n[116] A.-H. Karimi, J. von Kügelgen, B. Schölkopf, I. Valera, Towards\\ncausal algorithmic recourse, in: International Workshop on Extend-\\ning Explainable AI Beyond Deep Models and Classifiers, Springer,\\n2022, pp. 139–166.\\n[117] C. Novelli, M. Taddeo, L. Floridi, Accountability in artificial intel-\\nligence: what it is and how it works, AI & Society (2023) 1–12.\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 29 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\n[118] A. Institute, Algorithmic Accountability Policy Toolkit (2018).\\nURL https://ainowinstitute.org/aap-toolkit.pdf\\n[119] B. Kim, F. Doshi-Velez, Machine learning techniques for account-\\nability, AI Magazine 42 (1) (2021) 47–52.\\n[120] B. Xia, Q. Lu, H. 
Perera, L. Zhu, Z. Xing, Y. Liu, J. Whittle, Towards\\nconcrete and connected AI risk assessment (C2AIRA): A systematic\\nmapping study (2023). arXiv:2301.11616.\\n[121] J. A. Baquero, R. Burkhardt, A. Govindarajan, T. Wallace, Derisking\\nAI by design: How to build risk management into AI development,\\nMcKinsey & Company (2020).\\n[122] J. Mökander, M. Axente, F. Casolari, L. Floridi, Conformity assess-\\nments and post-market monitoring: A guide to the role of auditing\\nin the proposed european AI regulation, Minds and Machines 32 (2)\\n(2022) 241–268.\\n[123] A. Holzinger, A. Carrington, H. Müller, Measuring the quality of\\nexplanations: the system causability scale (SCS) comparing human\\nand machine explanations, KI-Künstliche Intelligenz 34 (2) (2020)\\n193–198.\\n[124] Z. Han, H. Yanco, Communicating missing causal information to\\nexplain a robot’s past behavior, ACM Transactions on Human-Robot\\nInteraction 12 (1) (2023) 1–45.\\n[125] M. Brundage, S. Avin, J. Wang, H. Belfield, G. Krueger, G. Hadfield,\\nH. Khlaaf, J. Yang, H. Toner, R. Fong, et al., Toward trustworthy\\nAI development: mechanisms for supporting verifiable claims, arXiv\\npreprint arXiv:2004.07213 (2020).\\n[126] L. Bourtoule, V. Chandrasekaran, C. A. Choquette-Choo, H. Jia,\\nA. Travers, B. Zhang, D. Lie, N. Papernot, Machine unlearning, in:\\nIEEE Symposium on Security and Privacy (SP), IEEE, 2021, pp.\\n141–159.\\n[127] S. Verma, V. Boonsanong, M. Hoang, K. E. Hines, J. P. Dickerson,\\nC. Shah, Counterfactual explanations and algorithmic recourses\\nfor machine learning: A review, in: NeurIPS 2020 Workshop: ML\\nRetrospectives, Surveys & Meta-Analyses (ML-RSA), 2020.\\n[128] S. Barocas, A. D. Selbst, Big data’s disparate impact, California Law\\nReview (2016) 671–732.\\n[129] L. Floridi, M. Holweg, M. Taddeo, J. Amaya Silva, J. Mökander,\\nY. Wen, CapAI-A procedure for conducting conformity assessment\\nof AI systems in line with the EU artificial intelligence act, Available\\nat SSRN 4064091 (2022).\\n[130] R. Parenti, Regulatory sandboxes and innovation hubs for fintech,\\nStudy for the Committee on Economic and Monetary Affairs, Policy\\nDepartment for Economic, Scientific and Quality of Life Policies,\\nEuropean Parliament, Luxembourg (2020) 65.\\n[131] F. Pop, L. Adomavicius, Sandboxes for responsible artificial intelli-\\ngence. eipa briefing september 2021. (2021).\\n[132] K. Yordanova, The EU AI Act-Balancing human rights and innova-\\ntion through regulatory sandboxes and standardization (2022).\\n[133] J. Soler Garrido, S. Tolan, I. Hupon Torres, D. Fernandez Llorca,\\nV. Charisi, E. Gomez Gutierrez, H. Junklewitz, R. Hamon, D. Fano\\nYela, C. Panigutti, AI Watch: Artificial intelligence standardisation\\nlandscape update, Tech. rep., Joint Research Centre (Seville site)\\n(2023).\\n[134] T. Madiega, A. L. Van De Pol, Artificial intelligence act and regu-\\nlatory sandboxes. EPRS European Parliamentary Research Service.\\nJune 2022 (2022).\\nURL https://www.europarl.europa.eu/RegData/etudes/BRIE/2022/73\\n3544/EPRS_BRI(2022)733544_EN.pdf\\n[135] Coalition for Health AI (CHAI), Blueprint for trustworthy AI imple-\\nmentation guidance and assurance for healthcare (2023).\\nURL https://www.coalitionforhealthai.org/papers/Blueprint%20fo\\nr%20Trustworthy%20AI.pdf\\n[136] J. Zhang, Z.-M. Zhang, Ethics and governance of trustworthy med-\\nical artificial intelligence, BMC Medical Informatics and Decision\\nMaking 23 (1) (2023) 1–15.\\n[137] J. Amann, D. Vetter, S. N. Blomberg, H. C. Christensen, M. 
Coffee,\\nS. Gerke, T. K. Gilbert, T. Hagendorff, S. Holm, M. Livne, et al., To\\nexplain or not to explain?– Artificial intelligence explainability in\\nclinical decision support systems, PLOS Digital Health 1 (2) (2022)\\ne0000016.\\n[138] R. V. Zicari, J. Brodersen, J. Brusseau, B. Düdder, T. Eichhorn,\\nT. Ivanov, G. Kararigas, P. Kringen, M. McCullough, F. Möslein,\\net al., Z-inspection®: a process to assess trustworthy AI, IEEE\\nTransactions on Technology and Society 2 (2) (2021) 83–97.\\n[139] H. Muller, M. T. Mayrhofer, E.-B. Van Veen, A. Holzinger, The\\nten commandments of ethical medical AI, Computer 54 (07) (2021)\\n119–123.\\n[140] K. Stöger, D. Schneeberger, A. Holzinger, Medical artificial intelli-\\ngence: the european legal perspective, Communications of the ACM\\n64 (11) (2021) 34–36.\\n[141] J. Baker-Brunnbauer, TAII Framework for Trustworthy AI systems,\\nROBONOMICS: The Journal of the Automated Economy 2 (2021)\\n17.\\n[142] Editorials, Writing the rules in ai-assisted writing, Nature Machine\\nIntelligence 469 (5) (2023) 469–469. doi:https://doi.org/10.1038/\\ns42256-023-00678-6.\\n[143] C. T. Zirpoli, Generative artificial intelligence and copyright law,\\nUnited States Congressional Research Service, CRS Legal Sidebar,\\n(February 23, 10922 (5 pages) (2023).\\n[144] J. Tang, A. LeBel, S. Jain, A. G. Huth, Semantic reconstruction\\nof continuous language from non-invasive brain recordings, Nature\\nNeuroscience (2023) 1–9.\\n[145] S. Reardon, Mind-reading machines are here: is it time to worry?,\\nNature 617 (7960) (2023) 236–236.\\n[146] R. Burnell, W. Schellaert, J. Burden, T. D. Ullman, F. Martinez-\\nPlumed, J. B. Tenenbaum, D. Rutar, L. G. Cheke, J. Sohl-Dickstein,\\nM. Mitchell, et al., Rethink reporting of evaluation results in ai,\\nScience 380 (6641) (2023) 136–138.\\n[147] P. Liang, R. Bommasani, T. Lee, D. Tsipras, D. Soylu, M. Yasunaga,\\nY. Zhang, D. Narayanan, Y. Wu, A. Kumar, et al., Holistic evaluation\\nof language models, arXiv preprint arXiv:2211.09110 (2022).\\n[148] T. Hernandez-Boussard, S. Bozkurt, J. P. Ioannidis, N. H. Shah,\\nMINIMAR (MINimum Information for Medical AI Reporting): de-\\nveloping reporting standards for artificial intelligence in health care,\\nJournal of the American Medical Informatics Association 27 (12)\\n(2020) 2011–2015.\\n[149] J. Schuett, A. Reuel, A. Carlier, How to design an AI ethics board,\\narXiv preprint arXiv:2304.07249 (2023).\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 30 of 30\\n\\n\\n\\n---\\n\\n\\n Three lines of defense against risks from AI \\nJonas Schuett* \\n \\nOrganizations that develop and deploy artificial intelligence (AI) systems need to \\nmanage the associated risks—for economic, legal, and ethical reasons. However, it is \\nnot always clear who is responsible for AI risk management. The Three Lines of De-\\nfense (3LoD) model, which is considered best practice in many industries, might offer \\na solution. It is a risk management framework that helps organizations to assign and \\ncoordinate risk management roles and responsibilities. In this article, I suggest ways \\nin which AI companies could implement the model. I also discuss how the model \\ncould help reduce risks from AI: it could identify and close gaps in risk coverage, \\nincrease the effectiveness of risk management practices, and enable the board of di-\\nrectors to oversee management more effectively. 
The article is intended to inform \\ndecision-makers at leading AI companies, regulators, and standard-setting bodies. \\n1 Introduction \\nOrganizations that develop and deploy artificial intelligence (AI) systems need \\nto manage the associated risks—for economic reasons, because accidents and \\ncases of misuse can threaten business performance (Cheatham, Javanmardian, \\n& Hamid Samandari, 2019), for legal reasons, because upcoming AI regulation \\nmight require them to implement a risk management system (Schuett, 2022), \\nand for ethical reasons, because AI could have large and long-lasting impacts \\non society (Clarke & Whitlestone, 2022). \\n \\nHowever, it is not always clear who is responsible for AI risk management: \\nThe researchers and engineers? The legal and compliance department? The \\ngovernance team? The Three Lines of Defense (3LoD) model might offer a \\nsolution. It is a risk management framework intended to improve an organiza-\\ntion’s risk governance by assigning and coordinating risk management roles \\nand responsibilities (Institute of Internal Auditors [IIA], 2013, 2020a). It is \\nconsidered best practice in many industries, such as finance and aviation. In \\nthis article, I apply the 3LoD model to an AI context. \\n \\nTo date, there has not been much academic work on the intersection of AI \\nand the 3LoD model. Nunn (2020) suggests using the model to reduce \\n__________ \\n* Research Fellow, Centre for the Governance of AI, Oxford, UK; Research Affiliate, \\nLegal Priorities Project, Cambridge, MA, USA; PhD Candidate, Faculty of Law, Goethe \\nUniversity Frankfurt, Germany; jonas.schuett@governance.ai. \\nThree lines of defense against risks from AI \\n2 \\ndiscrimination risks from AI, but the relevant passage is very short. There is \\nalso some literature on how companies could use AI to support the three lines \\n(Tammenga, 2020; Sekar, 2022), but I am mainly interested in how to govern \\nAI companies, not how to use AI to govern non-AI companies. It has also been \\nproposed that governments could use the 3LoD model to manage extreme risks \\nfrom AI (Ord, 2021), but here I focus on the challenges of companies, not gov-\\nernment. \\n \\nWhile academic scholarship on this topic may be limited, there is some rel-\\nevant work from practitioners. Most notably, there is a blog post by PwC that \\nseeks to answer questions similar to this article (Rao & Golbin, 2021). But \\nsince they only dedicate a short section to the 3LoD model, their proposal only \\nscratches the surface. The IIA has also published a three-part series, in which \\nthey propose an AI auditing framework (IIA, 2017a, 2017c, 2018). Although \\ntheir proposal contains a reference to the 3LoD model, it does not play a key \\nrole. Finally, the 3LoD model is mentioned in a playbook that the National \\nInstitute of Standards and Technology (NIST) published alongside the second \\ndraft of its AI Risk Management Framework (NIST, 2022a). However, the \\nplaybook only suggests implementing the 3LoD model (or a related mecha-\\nnism), it does not specify how to do so. \\n \\nTaken together, there are at least two gaps in the current literature. The first \\none is practical: there does not seem to be a concrete proposal for how organi-\\nzations that develop and deploy AI systems could implement the 3LoD model. \\nThe few proposals that exist are not detailed enough to provide meaningful \\nguidance. 
The second one is normative: there does not seem to be a thorough \\ndiscussion about whether implementing the model is even desirable. Given that \\nthe model has been criticized and there is not much empirical evidence for its \\neffectiveness, the answer to this question is not obvious. In light of this, the \\narticle seeks to answer two research questions: (1) How could organizations \\nthat develop and deploy AI systems implement the 3LoD model? (2) To what \\nextent would implementing the 3LoD model help reduce risks from AI? \\n \\nThe article has three areas of focus. First, it focuses on organizations that \\ndevelop and deploy state-of-the-art AI systems, in particular medium-sized re-\\nsearch labs (e.g. DeepMind and OpenAI) and big tech companies (e.g. Google \\nand Microsoft), though the boundaries between the two categories are blurry \\n(e.g. DeepMind is a subsidiary of Alphabet and OpenAI has a strategic part-\\nnership with Microsoft). In the following, I use the term “AI companies” to \\nrefer to all of them. I do not cover other types of companies (e.g. hardware \\ncompanies) or academic institutions, but they might also benefit from my anal-\\nysis. Second, the article focuses on the organizational dimension of AI risk \\nmanagement. It is not about how AI companies should identify, assess, and \\nrespond to risks from AI. Instead, it is about how they should assign and coor-\\ndinate risk management roles and responsibilities. Third, the article focuses on \\nthe model’s ability to prevent individual, collective, or societal harm (Smuha, \\n2021). I am less interested in risks to companies themselves (e.g. litigation or \\nJonas Schuett \\n3 \\nreputation risks), though occasionally private and public interests are aligned \\n(e.g. one way to reduce litigation risks is to prevent accidents). \\n \\nThe remainder of this article proceeds as follows. Section 2 gives an over-\\nview of the model’s basic structure, history, criticisms, and evidence base. Sec-\\ntion 3 suggests ways in which AI companies could implement the model. Sec-\\ntion 4 discusses how the model could help reduce risks from AI. Section 5 con-\\ncludes and suggests questions for further research.\\n2 The 3LoD model \\nIn this section, I give an overview of the basic structure (Section 2.1) and his-\\ntory of the 3LoD model (Section 2.2). I also engage with some of the main \\ncriticisms, briefly discuss alternative models (Section 2.3), and review the em-\\npirical evidence for its effectiveness (Section 2.4). \\n2.1 Basic structure \\nThere are different versions of the 3LoD model. Most practitioners and schol-\\nars are familiar with the version published by the IIA (2013). After a review \\nprocess, they published an updated version (IIA, 2020a), which increasingly \\nreplaces the original version. This article will mainly use the updated version, \\nas illustrated in Figure 1. The updated model has three types of elements: ac-\\ntors, roles, and relationships. \\n \\n \\n \\nFigure 1: The 3LoD model as described by the IIA (2020a) \\n \\nThe model distinguishes between four actors, represented as blue boxes: the \\ngoverning body, which is accountable to stakeholders for organizational over-\\nsight; management, which takes actions to achieve the organization’s objec-\\ntives; internal audit, which provides independent assurance to the governing \\nbody, as do external assurance providers. 
\\nThree lines of defense against risks from AI \\n2 \\n \\nThe model further distinguishes between four roles, represented as gray \\nboxes. The role of the governing body is to demonstrate integrity, leadership, \\nand transparency. In addition to that, the model contains three roles which it \\ncalls “lines of defense”. The first line provides products and services to clients, \\nand manages the associated risks. The second line assists the first line with \\nregards to risk management. It provides complementary expertise and support, \\nbut also monitors and challenges risk management practices. The third line \\nprovides independent and objective assurance and advice on all matters related \\nto the achievement of risk objectives. The first two lines are part of manage-\\nment, while the third line is synonymous with internal audit. \\n \\nFinally, there are three types of relationships between different actors, rep-\\nresented as arrows. There are top-down relationships: the governing body del-\\negates responsibility to management and oversees internal audit. Inversely, \\nthere are bottom-up relationships: management and internal audit are account-\\nable and report to the governing body. And lastly, there is a horizontal rela-\\ntionship between actors whose work must be aligned, namely between man-\\nagement and internal audit. \\n2.2 Brief history \\nThe model’s origins are opaque. There are theories suggesting military, sport-\\ning, or quality control origins (Davies & Zhivitskaya, 2018). It was presumably \\ndeveloped in the late 1990s or early 2000s. In 1999, the Basel Committee on \\nBanking Supervision (BCBS) suggested a similar approach to risk oversight \\n(BCBS, 1999), but the first explicit mention of the model was probably in a \\nreport by the UK Financial Services Authority (2003) or a paper by Roman \\nKräussl (2003). \\n \\nAfter the financial crisis of 2007-2008, which was partly caused by wide-\\nspread risk management failures (Boatright, 2016), the model’s popularity sky-\\nrocketed. In response to the crisis, regulators and supervisory authorities paid \\nincreasing attention to the chief risk officer (CRO) and the risk committee of \\nthe board (Walker, 2009; Davies & Zhivitskaya, 2018), and started recom-\\nmending the 3LoD model (BCBS, 2012; European Banking Authority, 2021). \\nMost academic work on the model was also done after the crisis (e.g. Davies \\n& Zhivitskaya, 2018; Bantleon et al., 2021) and many risk management pro-\\nfessionals only heard about the model in its aftermath (Zhivitskaya, 2015). \\n \\nToday, most listed companies have implemented the 3LoD model. In a 2015 \\nsurvey of internal audit professionals in 166 countries (n=14,518), the majority \\nof respondents (75%) reported that their organization follows the 3LoD model \\nas articulated by the IIA (Huibers, 2015).1 Another survey, conducted in 2021 \\namong chief audit executives (CAEs) in Austria, Germany, and Switzerland \\n(n=415), supports their findings (Bantleon et al., 2021). The majority of \\n__________ \\n1 Note that respondents who said they were not familiar with the model were excluded. \\nJonas Schuett \\n3 \\nrespondents (88%) reported that they had implemented the model, with partic-\\nularly high adoption rates among financial institutions (96%). \\n \\nIn contrast, big tech companies do not seem to have implemented the 3LoD \\nmodel. It is not mentioned in any of their filings to the US Securities and Ex-\\nchange Commission (SEC) or other publications. 
The model is also not explic-\\nitly mentioned in the corporate governance requirements by Nasdaq (2022), \\nwhere all big tech companies are listed. It is worth noting, however, that the \\nrisk oversight practices at big tech companies do have some similarities with \\nthe 3LoD model. For example, they all seem to have an internal audit function \\n(e.g. Microsoft, 2022; Alphabet, 2022). Based on public information, medium-\\nsized AI research labs do not seem to have implemented the model either. \\n2.3 Criticisms and alternative models \\nDespite the model’s popularity in many industries, it has also been criticized \\n(Arndorfer & Minto, 2015; Zhivitskaya, 2015; Davies & Zhivitskaya, 2018; \\nHoefer, Cooke, & Curry, 2020; Vousinas, 2021). Arndorfer and Minto (2015) \\nidentify four weaknesses and past failures of the 3LoD model. First, they argue, \\nthe incentives for risk-takers in the first line are often misaligned. When facing \\na tradeoff between generating profits and reducing risks, they have historically \\nbeen incentivized to prioritize the former. Second, there is often a lack of or-\\nganizational independence for second line functions. They are too close to \\nprofit-seekers, which can lead to the adoption of more risk-taking attitudes. \\nThird, second line functions often lack the necessary skills and expertise to \\nchallenge practices and controls in the first line. And fourth, the effectiveness \\nof internal audit depends on the knowledge, skills, and experience of individ-\\nuals, which might be inadequate. Another common criticism is that the model \\nprovides a false sense of security. Put simply, “when there are several people \\nin charge—no one really is” (Davies & Zhivitskaya, 2018). Another criticism \\nis that the model is too bureaucratic and costly. Additional layers of oversight \\nmight reduce risk, but they come at the cost of efficiency (Zhivitskaya, 2015). \\nA final criticism is that the model depends on information flow between the \\nlines, but there are many barriers to this. For example, the second line might \\nnot recognize that they only see what the first line chooses to show them \\n(Zhivitskaya, 2015). While these criticisms identify relevant shortcomings and \\nshould be taken seriously, they do not put into question the model as a whole. \\nMoreover, the 3LoD model has been improved over the years. Today, the focus \\nis on increasing the model’s effectiveness and responding to criticisms (Davies \\n& Zhivitskaya, 2018). \\n \\nIn view of these criticisms, several alternative models have been suggested. \\nFor example, Arndorfer and Minto (2015) proposed the Four Lines of Defense \\n(4LoD) model to better meet the needs of financial institutions. The fourth line \\nconsists of supervisory authorities and external audit, who are supposed to \\nwork closely with internal audit. Another example is the Five Lines of Assur-\\nance (5LoA) model, which was gradually developed by several scholars and \\nThree lines of defense against risks from AI \\n4 \\norganizations (Leech & Hanlon, 2016). However, the proposed changes do not \\nnecessarily improve the model. It has been argued that adding more lines would \\nover-complicate the model, and that firms and regulators currently do not want \\nstructural changes (Davies & Zhivitskaya, 2018). It is also worth noting that \\nthe alternative models are far less popular than the original model. 
Compared \\nto these alternative models, the 3LoD model remains “the most carefully artic-\\nulated risk management system that has so far been developed” (Davies & \\nZhivitskaya, 2018). But what empirical evidence do we have for its effective-\\nness? \\n2.4 Empirical evidence \\nBy “effectiveness”, I mean the degree to which the model helps organizations \\nto achieve their objectives. For the purpose of this article, I am mostly inter-\\nested in the achievement of risk objectives. This may include: (1) reducing \\nrelevant risks to an acceptable level (e.g. risks of harm to individuals, groups, \\nand society), (2) ensuring that management and the board of directors are aware \\nof the nature and scale of key risks, which allows them to define the organiza-\\ntion’s risk appetite (COSO, 2017), and (3) compliance with relevant risk regu-\\nlations (Schuett, 2022). I am less interested in other objectives (e.g. improving \\nfinancial performance), though there might be overlaps (e.g. reducing the risk \\nof harm to individuals might also reduce the risk of financial losses from liti-\\ngation cases). For an overview of different ways to measure the effectiveness \\nof internal audit, see Rupšys and Boguslauskas (2007), Savčuk (2007), and \\nBoţa-Avram and Palfi (2009). \\n \\nThere do not seem to be any (high-quality) studies on the effectiveness of \\nthe 3LoD model in the above-mentioned sense.2 There only seems to be evi-\\ndence for the effectiveness of internal audit (Lenz & Hahn, 2015; Eulerich & \\nEulerich, 2020). For example, a survey of CAEs at multinational companies in \\nGermany (n=37) compared audited and non-audited business units within the \\nsame company (Carcello et al., 2020). They found that managers of audited \\nunits perceive a greater decline in risk compared to managers of non-audited \\nunits. Other studies find that internal audit helps to strengthen internal control \\nsystems (Lin et al., 2011; Oussii & Taktak, 2018) and has a positive influence \\non the prevention and identification of fraud (Coram, Ferguson, & Moroney, \\n2008; Ma’ayan & Carmeli 2016; Drogalas et al., 2017). The fact that the 3LoD \\nmodel was not able to prevent past scandals and crises seems to provide weak \\nevidence against its effectiveness (though another explanation could be that the \\nmodel was poorly implemented in these cases), while the model’s ongoing \\npopularity seems to provide weak evidence in favor of its effectiveness (though \\nthe model’s popularity could also be explained by path dependencies). Finally, \\nthere is anecdotal evidence in both directions (Zhivitskaya, 2015). \\n__________ \\n2 There is also not much evidence on the model’s effectiveness based on other interpre-\\ntations of effectiveness. The only exception seems to be a recent study of the 500 largest \\ncompanies in Denmark, which finds that a higher degree of adherence to first and second \\nline practices is positively associated with financial performance (Andersen, Sax, & Gian-\\nnozzi, 2022). Besides that, there are only studies on the effects of internal audit (Lenz & \\nHahn, 2015; Eulerich & Eulerich, 2020; Jiang, Messier, & Wood, 2020), none of which \\nmentions the 3LoD model. \\n \\nOverall, despite the model’s popularity, “its effectiveness [remains] un-\\ntested” (Davies & Zhivitskaya, 2018) and “not based on any clear evidence” \\n(Power, Ashby, & Palermo, 2013). To be clear, it is not the case that we have \\nrobust evidence that the model is ineffective. 
It is still very plausible that the \\nmodel can be effective, but there have not been (high-quality) studies providing \\nempirical evidence for its effectiveness in the above-mentioned sense. \\n \\nThis surprising lack of evidence could be explained by the following, rather \\nspeculative reasons. First, since it is not feasible to run randomized controlled \\ntrials on organizational interventions, it is inherently difficult to collect robust \\nevidence. Second, the model is designed to be flexible and adaptable, which \\nmeans that there is not a single, standardized way to implement it. This lack of \\nstandardization can make it difficult to compare different implementations of \\nthe model and to assess their effectiveness.3 Third, since most practitioners \\nmainly care about financial performance, scholars might be incentivized to fo-\\ncus on that to justify the relevance of their work (though there is not much \\nevidence on that either). \\n \\nEven if we had more empirical evidence from other industries, its informa-\\ntive value might still be limited. One reason is that findings might not general-\\nize to an AI context. AI companies are structurally different from other com-\\npanies because they have a special focus on research, and, since AI is a general-\\npurpose technology (Crafts, 2021), risks from AI are broader than risks from \\nother products and services. Another reason is that the biggest driver of the \\nmodel’s ability to reduce risks is likely the concrete way in which it is imple-\\nmented. Instead of asking “is the 3LoD model effective?”, AI companies \\nshould ask “how can we implement the model in an effective way?”. \\n3 Applying the 3LoD model to an AI context \\nThis section suggests ways in which AI companies could implement the 3LoD \\nmodel. For each of the three lines, I suggest equivalent roles and responsibili-\\nties. First, I describe the content of their responsibilities, then I discuss which \\nteam or individual would be responsible, as illustrated in Figure 2. \\n \\n__________ \\n3 This argument was suggested by OpenAI’s language model GPT-3. \\nThree lines of defense against risks from AI \\n6 \\n \\n \\nFigure 2: Sample org chart of an AI company with equivalent responsibilities \\nfor each of the three lines \\n3.1 First line \\nThe first line has two main responsibilities: providing products and services to \\nclients, which corresponds to AI research and product development, and man-\\naging the associated risks. Below, I focus on the latter. \\n \\nThe first line is responsible for establishing and maintaining appropriate \\nstructures and processes for the management of risk. This involves measures \\nalong all steps of the risk management process. For example, to identify risks \\nfrom AI, the first line could use risk taxonomies (Microsoft, 2020; Weidinger \\net al., 2021; Raji et al., 2022), incident databases (McGregor, 2021), or sce-\\nnario planning and wargaming (International Electrotechnical Commission \\n[IEC], 2019; Gyengo & Bruner, 2022). To estimate the likelihood and severity \\nof the identified risks, and to assess potential vulnerabilities, the first line might \\nuse Bayesian networks, Monte Carlo simulations, or penetration testing (IEC, \\n2019; International Organization for Standardization [ISO] & IEC, 2022). 
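As a minimal illustration of the estimation step just mentioned, the sketch below shows how a first-line team might run a toy Monte Carlo simulation to estimate the likelihood and severity of a single identified risk. It is not taken from the article or the cited standards; the incident rate, the loss distribution, and all parameter values are assumptions invented for the example.

```python
import random

def simulate_annual_loss(n_trials: int = 20_000, seed: int = 0) -> dict:
    """Toy Monte Carlo estimate of likelihood and severity for one identified risk.

    Assumed (illustrative) inputs: each deployment has a small probability of
    causing an incident, and the harm of an incident follows a log-normal law.
    """
    rng = random.Random(seed)
    incident_prob = 0.02        # assumed probability of an incident per deployment
    deployments_per_year = 50   # assumed number of deployments per year
    losses = []
    for _ in range(n_trials):
        total = 0.0
        for _ in range(deployments_per_year):
            if rng.random() < incident_prob:
                total += rng.lognormvariate(10, 1.5)  # severity of one incident
        losses.append(total)
    losses.sort()
    return {
        "p_any_incident": sum(loss > 0 for loss in losses) / n_trials,
        "expected_annual_loss": sum(losses) / n_trials,
        "p95_annual_loss": losses[int(0.95 * n_trials)],
    }

print(simulate_annual_loss())
```

The point of such a simulation is not the specific numbers but making the assumptions behind a likelihood and severity judgment explicit, so that the second and third lines can challenge them.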
To \\nreduce risks, it could fine-tune the model on a curated dataset (Solaiman & \\nDennison, 2021), introduce a policy for the publication of potentially harmful \\nresearch (Partnership on AI, 2021; Solaiman et al., 2019), or only grant struc-\\ntured access to models (e.g. via an API) to reduce misuse risks (Shevlane, \\n2022). The first line could also take a more holistic approach and implement \\nan AI-specific risk management framework (e.g. NIST, 2022b; ISO & IEC, \\nn.d.) or customize a more general enterprise risk management (ERM) frame-\\nwork (e.g. ISO, 2018; Committee of Sponsoring Organizations of the Tread-\\nway Commission [COSO], 2017). \\n \\nThe first line is also responsible for ensuring compliance with legal, regula-\\ntory, and ethical expectations. Legal obligations might stem from anti-discrim-\\nination law (Wachter, Mittelstadt, & Russell, 2021), data protection law \\n(Hamon et al., 2022), or antitrust law (Hua & Belfield, 2021). A notable exam-\\nple of AI regulation is the proposed EU AI Act (European Commission, 2021), \\nwhich requires providers of high-risk AI systems to implement a risk manage-\\nment system (Schuett, 2022). Ethical expectations might stem from AI ethics \\nprinciples that organizations have adopted on a voluntary basis (Jobin, Ienca, \\n& Vayena, 2019). To ensure compliance, the first line relies on support from \\nthe second line (see below). \\n \\nFinally, the first line is responsible for informing the governing body about \\nthe outcomes of the above-mentioned measures, the degree to which risk ob-\\njectives are met, and the overall level of risk. This should take the form of a \\ncontinuous dialogue, including reporting about expected and actual outcomes. \\nReporting will typically include heat maps and risk registers (IEC, 2019), but \\nit could also involve information about specific models, in the form of model \\ncards (Mitchell et al., 2018), data sheets (Gebru et al., 2018), and system cards \\n(Green et al., 2021). Note that there should also be a reporting line from the \\nCRO to the chief executive officer (CEO) and the risk committee of the board \\n(see below). \\n \\nOperational managers are responsible, often in a cascading responsibility \\nstructure. At big tech companies, the lowest level of responsibility would lie \\nwith those managers who are in charge of the development of individual AI \\nproducts. If there is no stand-alone AI product and AI systems make up only \\npart of a product (e.g. WaveNet as a part of Google Assistant), then the lowest \\nlevel of responsibility would lie with those managers who lead the develop-\\nment of the AI part of the product (e.g. the research lead for WaveNet). At \\nmedium-sized research labs, the lowest level of responsibility for risk manage-\\nment would lie with research leads, i.e. senior researchers who are in charge of \\nindividual research projects. \\n \\nThere will usually be one or more intermediate levels of responsibility. This \\nmight include a number of mid-level managers responsible for broader product \\nareas (e.g. gaming) or research areas (e.g. reinforcement learning), though the \\ndetails depend on the particular organizational structures. The ultimate respon-\\nsibility for AI risk management lies with those C-suite executives who are re-\\nsponsible for product development (e.g. the chief technology officer [CTO]) or \\nresearch (e.g. the chief scientific officer [CSO]). 
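To make the reporting artifacts mentioned earlier in this subsection (risk registers and per-model documentation) slightly more concrete, here is a hypothetical sketch of a risk register entry and a simple roll-up that the first line could report upward. The field names, the 1-to-5 scales, and the example entries are invented for illustration and are not prescribed by IEC 31010 or the article.

```python
from dataclasses import dataclass

@dataclass
class RiskRegisterEntry:
    risk_id: str
    description: str
    owner: str         # operational manager responsible (first line)
    likelihood: int    # 1 (rare) .. 5 (almost certain), assumed scale
    severity: int      # 1 (negligible) .. 5 (catastrophic), assumed scale
    response: str      # e.g. "mitigate", "accept", "transfer", "avoid"

    @property
    def score(self) -> int:
        return self.likelihood * self.severity

register = [
    RiskRegisterEntry("R-001", "Harmful content generated via the public API",
                      "Product lead, text API", likelihood=4, severity=3, response="mitigate"),
    RiskRegisterEntry("R-002", "Personal data in the training corpus (GDPR)",
                      "Research lead, data pipeline", likelihood=2, severity=4, response="mitigate"),
]

# Roll-up for the continuous dialogue with the governing body
for entry in sorted(register, key=lambda e: e.score, reverse=True):
    print(f"{entry.risk_id}  score={entry.score:>2}  owner={entry.owner}  response={entry.response}")
```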
While it is possible to split \\nresponsibilities between two or more executives, this is often not advisable, \\nmainly because it can dilute responsibilities. \\n3.2 Second line \\nThe second line is responsible for assisting the first line with regards to risk \\nmanagement. It provides complementary expertise and support, but also mon-\\nitors and challenges risk management practices. \\n \\nSome risk management activities require special expertise that the first line \\ndoes not have. This might include legal expertise (e.g. how to comply with the \\nrisk management requirements set out in the proposed EU AI Act [Schuett, \\nThree lines of defense against risks from AI \\n8 \\n2022]), technical expertise (e.g. how to develop more truthful language models \\n[Evans et al., 2021]), or ethical expertise (e.g. how to define normative thresh-\\nolds for fairness [Kleinberg, Mullainathan, & Raghavan, 2016]). It might also \\ninclude risk-specific expertise (e.g. what risks language models pose \\n[Weidinger et al., 2021]) or risk management-specific expertise (e.g. best prac-\\ntices for red teaming safety filters [Rando et al., 2022]). The second line could \\nsupport the first line by drafting policies, processes, and procedures, as well as \\nframeworks, templates, and taxonomies. It might also advise on specific issues \\n(e.g. how to customize a risk management framework to better meet the spe-\\ncific needs of the company), provide general guidance (e.g. how to ensure com-\\npliance with safety-related policies among researchers and engineers), or offer \\ntrainings (e.g. how to process training data in a GDPR compliant way). \\n \\nThe second line is also responsible for monitoring and challenging the ade-\\nquacy and effectiveness of risk management practices. Risk management prac-\\ntices are ineffective if risk objectives are not met (e.g. the company fails to \\ncomply with relevant laws and regulations, or it is unable to reduce risks to an \\nacceptable level). They are inadequate if the same results could have been \\nachieved with fewer resources. The second line will typically use a number of \\nkey performance indicators (KPIs) to evaluate various dimensions of the ade-\\nquacy and effectiveness of risk management (e.g. number of identified risks, \\nnumber of incidents, or percentage of personnel trained on specific matters). \\n \\nSecond line responsibilities are split across multiple teams. This typically \\nincludes the risk management team as well as the legal and compliance team. \\nAlthough most big tech companies already have a risk management team, these \\nteams are mostly concerned with business risks (e.g. litigation or reputation \\nrisk). Risks from AI, especially societal risks, are usually not a major concern \\n(Smuha, 2021). If big tech companies want to change this, they could expand \\nthe responsibilities of existing teams. Setting up a new AI-specific risk man-\\nagement team seems less desirable, as it could lead to a diffusion of responsi-\\nbilities. There would likely be a cascading responsibility structure where the \\nCRO acts as the single point of accountability for the risk management process. \\nMedium-sized research labs usually do not have a dedicated risk management \\nteam. They could either set up a new team or task one or more people in other \\nteams with risk management-related support functions. \\n \\nAll AI companies beyond the early startup phase have a legal and compli-\\nance team. 
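The KPI-based monitoring described above can be illustrated with a small sketch of how a second-line function might compute such indicators from internal records. The record layout, the indicator names, and the 90% training threshold are assumptions made for the example rather than anything specified by the 3LoD model.

```python
# Hypothetical internal records (in practice pulled from a risk tool or HR system)
identified_risks = [{"id": "R-001"}, {"id": "R-002"}, {"id": "R-003"}]
incidents = [{"id": "I-017", "severity": "moderate", "related_risk": "R-001"}]
staff = [{"name": "A", "trained": True}, {"name": "B", "trained": False},
         {"name": "C", "trained": True}]

kpis = {
    "identified_risks": len(identified_risks),
    "incidents": len(incidents),
    "incidents_per_identified_risk": len(incidents) / max(len(identified_risks), 1),
    "pct_staff_trained": 100 * sum(s["trained"] for s in staff) / len(staff),
}

for name, value in kpis.items():
    print(f"{name}: {value}")

# The second line would flag indicators that breach agreed thresholds, e.g.:
if kpis["pct_staff_trained"] < 90:
    print("Flag: training coverage below the (assumed) 90% target")
```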
The team lead, and ultimately the chief compliance officer (CCO) \\nor chief legal officer (CLO), would be responsible for risk-related legal and \\ncompliance support. It is worth noting that the legal and compliance team can \\nalso be part of the first line if they are actually responsible for ensuring com-\\npliance. They are part of the second line if they do not have any decision power \\nand only support the first line (e.g. by writing legal opinions). The legal and \\ncompliance team can also seek support from external law firms. \\n \\nMany organizations that develop and deploy AI systems have other teams \\nthat could take on second line responsibilities. This might include technical \\nsafety, ethics, policy, or governance teams. However, in practice, these teams \\nJonas Schuett \\n9 \\nrarely consider themselves as being responsible for risk management. This \\nneeds to be taken into account when implementing the 3LoD model (e.g. by \\nrunning workshops to sensitize them to their widened responsibility). In gen-\\neral, AI companies should arguably avoid assigning second line responsibili-\\nties to them. \\n3.3 Third line \\nThe third line is responsible for providing independent assurance. It assesses \\nthe work of the first two lines and reports any shortcomings to the governing \\nbody. \\n \\nWhile the second line already monitors and challenges the adequacy and \\neffectiveness of the risk management practices, the third line independently \\nassesses their work—they supervise the supervisors, so to speak. They could \\ndo this by conducting an internal audit (Raji et al, 2020) or commissioning an \\nexternal audit (Buolamwini & Gebru, 2018; Mökander & Floridi, 2022). Such \\naudits could have different purposes and scopes (Mökander et al., 2022). They \\ncould evaluate compliance with laws, standards, or ethics principles (“compli-\\nance audit”) or seek to identify new risks in a more open-ended fashion (“risk \\naudit”). They could also assess the model itself, including the dataset it was \\ntrained on (“model audit”), the model’s impact (“impact audit”), or the com-\\npany’s governance (“governance audit”). Similarly, the third line could engage \\na red team before or after a model is deployed to assess if the first two lines \\nwere able to identify all relevant risks (Ganguli et al., 2022; Perez et al., 2022). \\nFor example, before OpenAI released DALL·E 2, they asked a group of exter-\\nnal experts to identify ways in which the model can be misused (Mishkin et al., \\n2022). In addition to that, the third line should also review key policies and \\nprocesses to find flaws and vulnerabilities (e.g. ways in which a policy that \\nrequires researchers to assess the societal impact of a model can be circum-\\nvented [Ashurst et al., 2022]). Note that this should also include a meta-assess-\\nment of the company’s implementation of the 3LoD model itself. \\n \\nThe third line also supports the governing body, typically the board of di-\\nrectors, by providing independent and objective information about the com-\\npany’s risk management practices (IIA, 2020b). Their main audience is usually \\nthe audit committee, which is mainly composed of non-executive directors. But \\nsince non-executive directors only work part-time and heavily depend on the \\ninformation provided to them by the executives, they need an independent ally \\nin the company to effectively oversee the executives (Davies & Zhivitskaya, \\n2018). 
The third line serves this function by maintaining a high degree of in-\\ndependence from management and reporting directly to the governing body \\nfollowing best practices. It is often described as their “eyes and ears” (IIA, \\n2020a). \\n \\nThe third line has a well-defined organizational home: internal audit. Note \\nthat, in this context, internal audit refers to a specific organizational unit. It \\ndoes not merely mean an audit that is done internally (Raji et al, 2020). Instead, \\nThree lines of defense against risks from AI \\n10 \\nit means “those individuals operating independently from management to pro-\\nvide assurance and insight on the adequacy and effectiveness of governance \\nand the management of risk (including internal control)” (IIA, 2020a). \\n \\nTypically, companies have a dedicated internal audit team, led by the CAE \\nor Head of Internal Audit. Most big tech companies have such a team, but sim-\\nilar to the risk management team, they often neglect the societal risks from AI. \\nInstead of creating a separate AI-specific internal audit team, they should cre-\\nate a sub-team within their existing internal audit team, or simply task one or \\nmore team members to focus on AI-specific risk management activities. Me-\\ndium-sized research labs usually do not have an internal audit team. They \\nwould have to create a new team or task at least one person with third line \\nresponsibilities. In short, big tech companies need to “bring AI to internal au-\\ndit”, while research labs need to “bring internal audit to AI”. It is worth noting \\nthat, although there are promising developments (IIA, 2017a, 2017c), the pro-\\nfession of AI-specific internal auditors is still in its infancy. \\n \\nSome AI companies have an ethics board (e.g. Microsoft’s Aether Commit-\\ntee and Facebook’s Oversight Board) which could also take on third line re-\\nsponsibilities, typically in addition to internal audit. It would have to be organ-\\nizationally independent from management, but still be part of the organization \\n(in contrast to external assurance providers). If organizations already have an \\nindependent ethics board (e.g. consisting of representatives from academia and \\ncivil society), they could form a working group that takes on third line respon-\\nsibilities. \\n4 How the 3LoD model could help reduce risks from AI \\nWhile there are many reasons why AI companies may want to implement the \\n3LoD model, this section focuses on three arguments about the model’s ability \\nto prevent individual, collective, and societal harm: the model could help re-\\nduce risks from AI by identifying and closing gaps in risk coverage (Sec-\\ntion 4.1), increasing the effectiveness of risk management practices (Sec-\\ntion 4.2), and enabling the governing body to oversee management more effec-\\ntively (Section 4.3). I also give an overview of other benefits (Section 4.4). It \\nis worth noting that, in the absence of robust empirical evidence (see above), \\nthe following discussion remains theoretical and often relies on abstract plau-\\nsibility considerations. \\n4.1 Identifying and closing gaps in risk coverage \\nAI risk management involves different people from different teams with dif-\\nferent responsibilities (Baquero et al., 2020). If these responsibilities are not \\ncoordinated adequately, gaps in risk coverage can occur (Bantleon et al., 2021). \\nSuch gaps may have different causes. 
For example, it might be the case that no \\none is responsible for managing a specific risk (e.g. there could be a blind spot \\nJonas Schuett \\n11 \\nfor diffuse risks), or it might be unclear who is responsible (e.g. two teams \\nmight incorrectly assume that the other team already takes care of a risk). Gaps \\ncould also occur if the responsible person is not able to manage the risk effec-\\ntively (e.g. because they do not have the necessary expertise, information, or \\ntime). If a specific risk is not sufficiently covered by the risk management sys-\\ntem, it cannot be identified, which might result in an incorrect risk assessment \\n(e.g. the total risk of an unsafe AI system is judged acceptable) and an inade-\\nquate risk response (e.g. an unsafe AI system is deployed without sufficient \\nsafety precautions). \\n \\nThe 3LoD model could prevent this by identifying and closing gaps in risk \\ncoverage. It could do this by offering a systematic way to assign and coordinate \\nrisk management-related roles and responsibilities. It ensures that people who \\nare closest to the risk are responsible for risk management (first line) and get \\nthe support they need (second line). Another way the 3LoD model can help \\nidentify blindspots is through the internal audit function (third line). They are \\nresponsible for assessing the adequacy and effectiveness of the entire risk man-\\nagement regime, which includes potential gaps in risk coverage. \\n \\nOne might object that, in practice, gaps in risk coverage are rare, and even \\nif they occur, they only concern minor risks (e.g. because AI companies have \\nfound other ways to address the biggest risks). However, the AI Incident Da-\\ntabase (McGregor, 2021) contains numerous entries, including several cases \\nclassified as “moderate” or “severe”, which indicates that incidents are not that \\nuncommon. While these incidents had many different causes, it seems plausi-\\nble that at least some of them were related to gaps in risk coverage. But since \\nthere does not seem to be any public data on this, the issue remains speculative. \\n \\nEven if one thinks that gaps in risk coverage are a common problem among \\nAI companies, one might question the model’s ability to identify and close \\nthem. One might suspect that the people involved and their ability and willing-\\nness to identify gaps play a much bigger role. While it is certainly true that \\nimplementing the model alone is not sufficient, neither is having able and will-\\ning personnel. Both are necessary and only together can they be sufficient \\n(though other factors, such as information sharing between different organiza-\\ntional units, might also play a role). \\n \\nOverall, it seems likely that implementing the 3LoD model would help un-\\ncover some gaps in risk coverage that would otherwise remain unnoticed. \\n4.2 Increasing the effectiveness of risk management practices \\nSome risk management practices are ineffective—they might look good on pa-\\nper, but do not work in practice. AI companies might fail to identify relevant \\nrisks, misjudge their likelihood or severity, or be unable to reduce them to an \\nacceptable level. Ineffective risk management practices can have many differ-\\nent causes, such as reliance on a single measure (e.g. using a single taxonomy \\nto identify a wide range of risks), a failure to anticipate deliberate attempts to \\ncircumvent measures (e.g. 
stealing an unreleased model), a failure to anticipate \\nThree lines of defense against risks from AI \\n12 \\nrelevant changes in the risk landscape (e.g. the emergence of systemic risks \\ndue to the increasing reliance on so-called “foundation models” [Bommasani \\net al., 2021]), cognitive biases of risk managers (e.g. the availability bias, i.e. \\nthe tendency to “assess the frequency of a class or the probability of an event \\nby the ease with which instances or occurrences can be brought to mind” \\n[Tversky & Kahneman, 1974]), and other human errors (e.g. a person filling \\nout a risk register slips a line), among other things. \\n \\nThe 3LoD model can increase the effectiveness of risk management prac-\\ntices by identifying such shortcomings. As mentioned above, internal auditors \\nassess the effectiveness of risk management practices (e.g. via audits or red \\nteaming exercises) and report any shortcomings to the governing body, which \\ncan engage with management to improve these practices. \\n \\nOne might object that most shortcomings only occur in low-stakes situa-\\ntions. In high-stakes situations, existing risk management practices are already \\nmore effective. For example, AI companies often conduct extensive risk as-\\nsessments before deploying state-of-the-art models (Brundage et al., 2022; Ka-\\nvukcuoglu et al., 2022). While this might be true in obvious cases, there are \\nless obvious cases where practices might not be as effective as intended (e.g. \\nbecause they are insensitive to human errors or deliberate attempts to circum-\\nvent them). I would certainly not want to rely on the counterargument that the \\neffectiveness of risk management practices already scales sufficiently with the \\nstakes at hand. \\n \\nSome AI companies might further object that they already have the equiva-\\nlent of an internal audit function, so implementing the 3LoD would only be a \\nmarginal improvement. While it might be true that some people at some com-\\npanies perform some tasks that are similar to what internal auditors do, to the \\nbest of my knowledge, assessing the effectiveness of risk management prac-\\ntices is not their main responsibility and they do not follow best practices from \\nthe internal audit profession, such as being organizationally independent from \\nmanagement (IIA, 2017b), which can lead to noticeable differences. \\n \\nOverall, I think this is one of the best arguments for implementing the 3LoD \\nmodel. Without a serious attempt to identify ineffective risk management prac-\\ntices, I expect at least some shortcomings to remain unnoticed. The degree to \\nwhich this is true mainly depends on internal audit’s ability and willingness to \\nserve this function. \\n4.3 Enabling the governing body to oversee management more effectively \\nThe governing body, typically the board of directors, is responsible for over-\\nseeing management. To do this, they need independent and objective infor-\\nmation about the company’s risk management practices. However, they heav-\\nily rely on information provided to them by the executives. To effectively over-\\nsee the executives, they need an independent ally in the company. \\n \\nInternal audit serves this function by maintaining a high degree of independ-\\nence from management and reporting directly to the audit committee of the \\nJonas Schuett \\n13 \\nboard. This can be important because, compared to other actors, the board has \\nsignificant influence over management. 
For example, they can replace the CEO \\n(e.g. if they repeatedly prioritize profits over safety), make strategic decisions \\n(e.g. blocking a strategic partnership with the military), and make changes to \\nthe company’s risk governance (e.g. setting up an ethics board). Note that there \\nis a complementary reporting line from the CRO to the risk committee of the \\nboard. \\n \\nOne might object that this function could also be served by other actors. For \\nexample, third-party auditors could also provide the board with independent \\nand objective information. While external audits can certainly play an im-\\nportant role, they have several disadvantages compared to internal audits: they \\nmight lack important context, companies might not want to share sensitive in-\\nformation with them (e.g. about ongoing research projects), and audits are typ-\\nically only snapshots in time. AI companies should therefore see external audit \\nas a complement to internal audit, not a substitution. There is a reason why the \\n3LoD model distinguishes between internal audit and external assurance pro-\\nviders. \\n \\nOne might further point out that in other industries, internal audit is often \\nperceived to intervene too late (Davies & Zhivitskaya, 2018) and to team up \\nwith management, instead of monitoring them (Roussy & Rodrigue, 2018). \\nThis would indeed be problematic. However, as discussed above, this does not \\nseem to be an inherent property of internal audit. Instead, it seems to be mainly \\ndriven by the particular way it is set up and the people involved. Having said \\nthat, AI companies should take this concern seriously and take measures to \\naddress it. \\n \\nOverall, I think that implementing the 3LoD model can significantly in-\\ncrease the board’s information base. This effect will be more noticeable at me-\\ndium-sized research labs, as most big tech companies already have an internal \\naudit function, albeit not an AI-specific one (see above). \\n4.4 Other benefits \\nImplementing the 3LoD model has many benefits other than reducing risks to \\nindividuals, groups, or society. Although these other benefits are beyond the \\nscope of this article, it seems warranted to at least give an overview. Below, I \\nbriefly discuss four of them. \\n \\nFirst, implementing the 3LoD model can avoid unnecessary duplications of \\nrisk coverage. Different people in different teams could be doing the same or \\nvery similar risk management work. This is often desirable because it can pre-\\nvent gaps in risk coverage (see above). But if such duplications are not neces-\\nsary, they can waste resources, such as labor, that could be used more produc-\\ntively elsewhere. AI companies therefore face an effectiveness-efficiency-\\ntradeoff. How this tradeoff ought to be resolved, depends on the particular con-\\ntext. For example, when dealing with catastrophic risks, effectiveness (prevent-\\ning gaps in risk coverage) seems more important than efficiency (avoiding \\nThree lines of defense against risks from AI \\n14 \\nunnecessary duplications of coverage). In this case, AI companies should \\nstrictly err on the side of too much coverage rather than risk gaps in important \\nareas. Overall, this benefit seems to be overstated and less relevant if one is \\nmainly concerned with risk reduction. \\n \\nSecond, AI companies that have implemented the 3LoD model might be \\nperceived as being more responsible. 
In general, risk management practices at \\nAI companies seem less advanced compared to many other industries (e.g. avi-\\nation or banking). By adapting existing best practices from other industries, \\nthey would signal that they aim to further professionalize their risk manage-\\nment practices, which could be perceived as being more responsible. This per-\\nception might have a number of benefits. For example, it could make it easier \\nto attract and retain talent that cares about ethics and safety. It could also help \\navoid overly burdensome measures from regulators. It might even be beneficial \\nin litigation cases for the question of whether or not an organization has ful-\\nfilled its duty of care. However, it seems questionable whether implementing \\nthe 3LoD model affects perception that much, especially compared to other \\ngovernance measures (e.g. publishing AI ethics principles or setting up an AI \\nethics board), mainly because most stakeholders, including most employees, \\ndo not know the model and cannot assess its relevance. An exception might be \\nregulators and courts who care more about the details of risk management prac-\\ntices. My best guess is that implementing the model will have noticeable effects \\non the perception of a few stakeholders, while most other stakeholders will not \\ncare. \\n \\nThird, implementing the 3LoD model can make it easier to hire risk man-\\nagement talent. The profession of AI risk management is in its infancy. I as-\\nsume that AI companies find it challenging to hire people with AI and risk \\nmanagement expertise. In most cases, they can either hire AI experts and train \\nthem in risk management, or hire risk management experts from other indus-\\ntries and train them in AI. Implementing the 3LoD model could make it easier \\nto hire risk management experts from other industries because they would al-\\nready be familiar with the model. This might become more important if one \\nassumes that AI companies will want to hire more risk management talent as \\nsystems get more capable and are used in more safety-critical situations (e.g. \\nDegrave et al., 2022). However, I do not find this argument very convincing. I \\ndoubt that implementing the 3LoD model would make a meaningful difference \\non relevant hiring decisions (e.g. on a candidate’s decision to apply or accept \\nan offer). Since the model is about the organizational dimension of risk man-\\nagement, it does not have significant effects on the day-to-day risk manage-\\nment work. Having said that, there might be smaller benefits (e.g. making the \\nonboarding process easier). My best guess is that the counterfactual impact of \\n3LoD implementation on hiring is low. \\n \\nFourth, implementing the 3LoD model might reduce financing costs. Rating \\nagencies tend to give better ratings to companies that have implemented an \\nERM framework (because doing so is considered best practice), and companies \\nwith better ratings tend to have lower financing costs (because they get better \\nJonas Schuett \\n15 \\ncredit conditions) (see Bohnert et al., 2019). There might be an analogous ef-\\nfect with regards to the implementation of the 3LoD model. Lower financing \\ncosts are particularly important if one assumes that the costs for developing \\nstate-of-the-art AI systems will increase because of increasing demand for \\ncompute (Sevilla et al., 2022), for example. 
In scenarios where commercial \\npressure is much higher than today, lower financing costs could also be im-\\nportant to continue safety research that does not contribute to product develop-\\nment. That said, I am uncertain to what extent the findings for ERM frame-\\nworks generalize to the 3LoD model. My best guess is that implementing the \\n3LoD would not have meaningful effects on the financing costs of medium-\\nsized research labs today. But I expect this to change as labs become more \\nprofitable and increasingly make use of other funding sources (e.g. credits or \\nbonds). \\n5 Conclusion \\nThis article has applied the 3LoD model to an AI context. It has suggested \\nconcrete ways in which medium-sized research labs like DeepMind and \\nOpenAI or big tech companies like Google and Microsoft could implement the \\nmodel to reduce risks from AI. It has argued that implementing the model could \\nprevent individual, collective, or societal harm by identifying and closing gaps \\nin risk coverage, increasing the effectiveness of risk management practices, \\nand enabling the governing body to oversee management more effectively. It \\nconcluded that, while there are some limitations and the effects should not be \\noverstated, the model can plausibly contribute to a reduction of risks from AI. \\n \\nBased on the findings of this article, I suggest the following questions for \\nfurther research. First, the article has highlighted the importance of internal \\naudit in AI risk management. But since there has not been much AI-specific \\nwork on internal audit, it would be valuable to review best practices from other \\nindustries (e.g. BCBS, 2012) and discuss the extent to which these practices \\nare applicable to an AI context. Second, my discussion of the model’s ability \\nto reduce risks from AI was mostly theoretical and relied on abstract plausibil-\\nity considerations. I encourage other scholars to assess these claims empiri-\\ncally. An industry case study similar to the one that Mökander and Floridi \\n(2022) conducted for ethics-based auditing could be a first step. Third, alt-\\nhough AI companies have not implemented the 3LoD model, they already per-\\nform many of the above-mentioned activities. To better target future work, it \\nwould be helpful to review existing risk management practices at leading AI \\ncompanies and conduct a gap analysis. Since public data is scarce, scholars \\nwould have to conduct interviews or surveys (e.g. an “AI risk management \\nbenchmark survey”), though I expect confidentiality to be a major obstacle. \\nFourth, the article has focused on the voluntary adoption of the 3LoD model. \\nIt would be important to know if existing or future regulations might even \\nThree lines of defense against risks from AI \\n16 \\nrequire AI companies to implement the model. For example, while Article 9 of \\nthe proposed EU AI Act does not mention the 3LoD model, it has been sug-\\ngested that future harmonized standards or common specifications should in-\\nclude the model (Schuett, 2022). The 3LoD model is also mentioned in the \\nplaybook that accompanies the NIST AI Risk Management Framework (NIST, \\n2022a, 2022b). It is conceivable that this framework will be translated into US \\nlaw, similar to the NIST Framework for Improving Critical Infrastructure Cy-\\nbersecurity (NIST, 2018). Finally, the article has investigated the 3LoD in iso-\\nlation. 
It has excluded contextual factors, such as the risk culture at AI compa-\\nnies, which might also affect the model’s effectiveness. A better understanding \\nof these factors would further improve the information base for decision-mak-\\ners at AI companies and beyond. \\n \\nAs famously put by George Box (1976), “all models are wrong, but some \\nare useful”. In the same spirit, one might say that the 3LoD model is not a silver \\nbullet against the risks from AI, but it can still play an important role. AI com-\\npanies should see it as one of many governance tools they can use to tackle \\ntoday’s and tomorrow’s threats from AI. \\nAcknowledgements \\nI am grateful for valuable comments and feedback from Leonie Koessler, \\nJames Ginns, Markus Anderljung, Andre Barbe, Noemie Dreksler, Toby \\nShevelane, Anne le Roux, Alexis Carlier, Emma Bluemke, Christoph Winter, \\nRenan Araújo, José Jaime Villalobos, Suzanne Van Arsdale, Alfredo Parra, \\nand Nick Hollman. All remaining errors are my own. \\nReferences \\nAlphabet. (2022). Notice of 2022 annual meeting of stockholders and proxy statement. SEC. \\nhttps://perma.cc/Q23E-WQWP \\nAndersen, T. J., Sax, J., & Giannozzi, A. (2022). Conjoint effects of interacting strategy-\\nmaking processes and lines of defense practices in strategic risk management: An empi-\\nrical \\nstudy. \\nLong \\nRange \\nPlanning, \\n55(6), \\n102164. \\nhttps://doi.org/10.1016/j.lrp.2021.102164 \\nArndorfer, I., & Minto, A. (2015). The “four lines of defence model” for financial instituti-\\nons. \\nFinancial \\nStability \\nInstitute, \\nBank \\nfor \\nInternational \\nSettlements. \\nhttps://perma.cc/UP35-KEYJ \\nAshurst, C., Hine, E., Sedille, P., & Carlier, A. (2022). AI ethics statements: Analysis and \\nlessons learnt from NeurIPS broader impact statements. 2022 ACM Conference on Fair-\\nness, \\nAccountability, \\nand \\nTransparency, \\n2047–2056. \\nhttps://doi.org/10.1145/3531146.3533780 \\nJonas Schuett \\n17 \\nBantleon, U., d’Arcy, A., Eulerich, M., Hucke, A., Pedell, B., & Ratzinger-Sakel, N. V. S. \\n(2021). Coordination challenges in implementing the three lines of defense model. Inter-\\nnational Journal of Auditing, 25(1), 59–74. https://doi.org/10.1111/ijau.12201 \\nBaquero, J. A., Burkhardt, R., Govindarajan, A., & Wallace, T. (2020). Derisking AI by \\ndesign: How to build risk management into AI development. McKinsey. \\nhttps://perma.cc/2WPN-A6CW \\nBCBS. (1999). Enhancing corporate governance for banking organisations. Bank for Inter-\\nnational Settlements. https://perma.cc/G2QP-7K5B \\nBCBS. (2012). The internal audit function in banks. Bank for International Settlements. \\nhttps://perma.cc/A57Q-8LZ6 \\nBoatright, J. (2016). Why risk management failed: Ethical and behavioral aspects. In A. G. \\nMalliaris, L. Shaw, & H. Shefrin (Eds.), The global financial crisis and its aftermath: \\nHidden \\nfactors \\nin \\nthe \\nmeltdown \\n(pp. 384–386). \\nOxford \\nUniversity \\nPress. \\nhttps://doi.org/10.1093/acprof:oso/9780199386222.003.0017 \\nBohnert, A., Gatzert, N., Hoyt, R. E., & Lechner, P. (2019). The drivers and value of enter-\\nprise risk management: Evidence from ERM ratings. The European Journal of Finance, \\n25(3), 234–255. https://doi.org/10.1080/1351847X.2018.1514314 \\nBommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. \\nS., Bohg, J., Bosselut, A., Brunskill, E., Brynjolfsson, E., Buch, S., Card, D., Castellon, \\nR., Chatterji, N., Chen, A., Creel, K., Davis, J. 
Q., Demszky, D., … Liang, P. (2021). On \\nthe opportunities and risks of foundation models. arXiv. https://doi.org/10.48550/ar-\\nXiv.2108.07258 \\nBoţa-Avram, C., & Palfi, C. (2009). Measuring and assessment of internal audit’s effec-\\ntiveness. Annals of Faculty of Economics, University of Oradea, 3(1), 784–790. \\nhttps://perma.cc/N5JC-J79R \\nBox, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, \\n71(356), 791–799. https://doi.org/10.1080/01621459.1976.10480949 \\nBrundage, M., Mayer, K., Eloundou, T., Agarwal, S., Adler, S., Krueger, G., Leike, J., & \\nMishkin, P. (2022, March 3). Lessons learned on language model safety and misuse. \\nOpenAI. https://perma.cc/8RKR-QJZY \\nBuolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in \\ncommercial gender classification. Proceedings of the 1st Conference on Fairness, Ac-\\ncountability and Transparency, 77–91. https://perma.cc/976J-AR93 \\nCarcello, J. V., Eulerich, M., Masli, A., & Wood, D. A. (2020). Are internal audits associated \\nwith reductions in perceived risk? Auditing: A Journal of Practice & Theory, 39(3), 55–\\n73. https://doi.org/10.2308/ajpt-19-036 \\nCheatham, B., Javanmardian, K., & Samandari, H. (2019). Confronting the risks of artificial \\nintelligence. McKinsey. https://perma.cc/T2CX-HYZF \\nClarke, S., & Whittlestone, J. (2022). A survey of the potential long-term impacts of AI: \\nHow AI could lead to long-term changes in science, cooperation, power, epistemics and \\nvalues. Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, 192–\\n202. https://doi.org/10.1145/3514094.3534131 \\nCoram, P., Ferguson, C., & Moroney, R. (2008). Internal audit, alternative internal audit \\nstructures and the level of misappropriation of assets fraud. Accounting and Finance, \\n48(4), 543–559. https://doi.org/10.1111/j.1467-629X.2007.00247.x \\nCOSO. (2017). Enterprise risk management — Integrating with strategy and performance. \\nhttps://perma.cc/5Z3G-KD6R \\nThree lines of defense against risks from AI \\n18 \\nCrafts, N. (2021). Artificial intelligence as a general-purpose technology: An historical per-\\nspective. Oxford Review of Economic Policy, 37(3), 521–536. https://doi.org/10.1093/ox-\\nrep/grab012 \\nD'Onza, G., Selim, G. M., Melville, R., & Allegrini, M. (2015). A study on internal auditor \\nperceptions of the function ability to add value. International Journal of Auditing, 19(3), \\n182–194. https://doi.org/10.1111/ijau.12048 \\nDavies, H., & Zhivitskaya, M. (2018). Three lines of defence: A robust organising frame-\\nwork, \\nor \\njust \\nlines \\nin \\nthe \\nsand? \\nGlobal \\nPolicy, \\n9(S1), \\n34–42. \\nhttps://doi.org/10.1111/1758-5899.12568 \\nDegrave, J., Felici, F., Buchli, J., Neunert, M., Tracey, B., Carpanese, F., Ewalds, T., Hafner, \\nR., Abdolmaleki, A., de las Casas, D., Donner, C., Fritz, L., Galperti, C., Huber, A., \\nKeeling, J., Tsimpoukelli, M., Kay, J., Merle, A., Moret, J.-M., … Riedmiller, M. (2022). \\nMagnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602, \\n414–419. https://doi.org/10.1038/s41586-021-04301-9 \\nDrogalas, G., Pazarskis, M., Anagnostopoulou, E., & Papachristou, A. (2017). The effect of \\ninternal audit effectiveness, auditor responsibility and training in fraud detection. Journal \\nof \\nAccounting \\nand \\nManagement \\nInformation \\nSystems \\n16(4), \\n434–454. \\nhttps://doi.org/10.24818/jamis.2017.04001 \\nEulerich, A., & Eulerich, M. 
(2020). What is the value of internal auditing? A literature \\nreview on qualitative and quantitative perspectives. Maandblad Voor Accountancy En \\nBedrijfseconomie, 94(3/4), 83–92. https://doi.org/10.5117/mab.94.50375 \\nEuropean Banking Authority. (2021). Final report on guidelines on internal governance \\nunder Directive 2013/36/EU (EBA/GL/2021/05). https://perma.cc/RCD8-V99V \\nEuropean Commission. (2021). Proposal for a regulation laying down harmonized rules on \\nartificial \\nintelligence \\n(Artificial \\nIntelligence \\nAct) \\n(COM(2021) \\n206 \\nfinal). \\nhttps://perma.cc/4YXM-38U9 \\nEvans, O., Cotton-Barratt, O., Finnveden, L., Bales, A., Balwit, A., Wills, P., Righetti, L., \\n& Saunders, W. (2021). Truthful AI: Developing and governing AI that does not lie. ar-\\nXiv. https://doi.org/10.48550/arXiv.2110.06674 \\nFinancial Services Authority. (2003). Building a framework for operational risk manage-\\nment: The FSA’s observations. https://perma.cc/5AX2-M2LF \\nGanguli, D., Lovitt, L., Kernion, J., Askell, A., Bai, Y., Kadavath, S., Mann, B., Perez, E., \\nSchiefer, N., Ndousse, K., Jones, A., Bowman, S., Chen, A., Conerly, T., DasSarma, N., \\nDrain, D., Elhage, N., El-Showk, S., Fort, S., … Clark, J. (2022). Red teaming language \\nmodels to reduce harms: Methods, scaling behaviors, and lessons learned. arXiv. \\nhttps://doi.org/10.48550/arXiv.2209.07858 \\nGebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., & \\nCrawford, K. (2021). Datasheets for datasets. arXiv. https://doi.org/10.48550/ar-\\nXiv.1803.09010 \\nGreen, N., Procope, C., Cheema, A., & Adediji, A. (2022, February 23). System cards, a new \\nresource for understanding how AI systems work. Meta AI. https://perma.cc/CQZ8-FQ44 \\nGyengo, J., & Bruner, M. (2022). Scenario planning and wargaming: What they are and \\nwhy we need them in today’s uncertain future. KPMG. https://perma.cc/BT67-VCM9 \\nHamon, R., Junklewitz, H., Sanchez, I., Malgieri, G., & De Hert, P. (2022). Bridging the gap \\nbetween AI and explainability in the GDPR: Towards trustworthiness-by-design in auto-\\nmated decision-making. IEEE Computational Intelligence Magazine, 17(1), 72–85. \\nhttps://doi.org/10.1109/MCI.2021.3129960 \\nHoefer, E., Cooke, M., & Curry, T. (2020, September 8). Three lines of defense: Failed \\npromises and what comes next. Reuters. https://perma.cc/V35X-VTC5 \\nJonas Schuett \\n19 \\nHua, S.-S., & Belfield, H. (2021). AI & antitrust: Reconciling tensions between competition \\nlaw and cooperative AI development. Yale Journal of Law & Technology, 23(415). \\nhttps://perma.cc/4VL2-QNNJ \\nHuibers, S. C. J. (2015). Combined assurance: One language, one voice, one view. IIA Re-\\nsearch \\nFoundation, \\nGlobal \\nInternal \\nAudit \\nCommon \\nBody \\nof \\nKnowledge. \\nhttps://perma.cc/D7YM-9GSY \\nIEC. (2019). Risk management — Risk assessment techniques (IEC Standard \\nNo. 31010:2019) https://perma.cc/5CDZ-GU9R \\nIIA. (2013). IIA position paper: The three lines of defense in effective risk management and \\ncontrol. https://perma.cc/NQM2-DD7V \\nIIA. (2017a). Artificial intelligence: Considerations for the profession of internal auditing \\n(Part I). https://perma.cc/K8WQ-VNFZ \\nIIA. (2017b). International standards for the professional practice of internal auditing. \\nhttps://perma.cc/AKU7-8YWZ \\nIIA. (2017c). The IIA’s artificial intelligence auditing framework: Practical applications \\n(Part A). https://perma.cc/U93U-LN75 \\nIIA. (2018). 
The IIA’s artificial intelligence auditing framework. Practical applications \\n(Part B). https://perma.cc/826X-Y3L7 \\nIIA. (2020a). The IIA’s three lines model: An update of the three lines of defense. \\nhttps://perma.cc/GAB5-DMN3 \\nIIA. (2020b). Good practice internal audit reports. https://perma.cc/7BQT-DTRD \\nISO. (2018). Risk management — Guidelines (ISO Standard No. 31000:2018). \\nhttps://perma.cc/76EJ-GATT \\nISO & IEC. (2022). Information security, cybersecurity and privacy protection — Informa-\\ntion security management systems — Requirements (ISO/IEC Standard No. 27001:2022). \\nhttps://perma.cc/Y237-ZMG2 \\nISO & IEC. (n.d.). Information technology — Artificial intelligence — Guidance on risk \\nmanagement (ISO/IEC Standard No. 23894). https://perma.cc/GL3Q-WHQ2 \\nJiang, L., Messier, W. F., & Wood, D. A. (2020). The association between internal audit \\noperations-related services and firm operating performance. Auditing: A Journal of Prac-\\ntice & Theory, 39(1), 101–124. https://doi.org/10.2308/ajpt-52565 \\nJobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. \\nNature Machine Intelligence, 1, 389–399. https://doi.org/10.1038/s42256-019-0088-2 \\nKavukcuoglu, K., Kohli, P., Ibrahim, L., Bloxwich, D., & Brown, S. (2022, September 14). \\nHow \\nour \\nprinciples \\nhelped \\ndefine \\nAlphaFold’s \\nrelease. \\nDeepMind. \\nhttps://perma.cc/3ARS-XLNV \\nKleinberg, J., Mullainathan, S., & Raghavan, M. (2016). Inherent trade-offs in the fair de-\\ntermination of risk scores. arXiv. https://doi.org/10.48550/arXiv.1609.05807 \\nKräussl, R. (2003). A critique on the proposed use of external sovereign credit ratings in \\nBasel II. \\nCenter \\nfor \\nFinancial \\nStudies, \\nGoethe \\nUniversity \\nFrankfurt. \\nhttps://perma.cc/PMB8-WSCA \\nLeech, T. J., & Hanlon, L. C. (2016). Three lines of defense versus five lines of assurance: \\nElevating the role of the board and CEO in risk governance. In R. Leblanc (Ed.), The \\nhandbook of board governance: A comprehensive guide for public, private and not-for-\\nprofit \\nboard \\nmembers \\n(pp. 335–355). \\nWiley. \\nhttps://doi.org/10.1002/9781119245445.ch17 \\nLenz, R., & Hahn, U. (2015) A synthesis of empirical internal audit effectiveness literature \\npointing to new research opportunities. Managerial Auditing Journal, 30(1), 5–33. \\nhttps://doi.org/10.1108/MAJ-08-2014-1072 \\nThree lines of defense against risks from AI \\n20 \\nLin, S., Pizzini, M., Vargus, M., & Bardhan, I. R. (2011). The role of the internal audit \\nfunction in the disclosure of material weaknesses. The Accounting Review, 86(1), 287–\\n323. https://doi.org/10.2308/accr.00000016 \\nMa’ayan, Y., & Carmeli, A. (2016). Internal audits as a source of ethical behavior, effi-\\nciency, and effectiveness in work units. Journal of Business Ethics, 137(2), 347–363. \\nhttps://doi.org/10.1007/s10551-015-2561-0 \\nMcGregor, S. (2021). Preventing repeated real world AI failures by cataloging incidents: \\nThe AI incident database. Proceedings of the AAAI Conference on Artificial Intelligence, \\n35(17), 15458–15463. https://doi.org/10.1609/aaai.v35i17.17817 \\nMicrosoft. (2020). Assessing harm: A guide for tech builders. https://perma.cc/PV3E-HL23 \\nMicrosoft. (2022). Notice of annual shareholders meeting and proxy statement 2022. SEC. \\nhttps://perma.cc/6NYQ-ZTMB \\nMishkin, P., Ahmad, L., Brundage, M., Krueger, G., & Sastry, G. (2022). DALL·E 2 preview: \\nRisks and limitations. GitHub. 
https://perma.cc/X467-47PX \\nMitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., Spitzer, E., \\nRaji, I. D., & Gebru, T. (2019). Model cards for model reporting. Proceedings of the \\nConference \\non \\nFairness, \\nAccountability, \\nand \\nTransparency, \\n220–229. \\nhttps://doi.org/10.1145/3287560.3287596 \\nMökander, J., & Floridi, L. (2022). Operationalising AI governance through ethics-based \\nauditing: An industry case study. AI and Ethics. https://doi.org/10.1007/s43681-022-\\n00171-7 \\nMökander, J., Schuett, J., Kirk, H. R., Floridi, L. (2022). Auditing large language models \\n[Unpublished manuscript]. \\nNasdaq. \\n(2022). \\nNasdaq \\n5600 \\nseries: \\nCorporate \\ngovernance \\nrequirements. \\nhttps://perma.cc/4M7B-U42F \\nNIST. (2018). Framework for improving critical infrastructure cybersecurity (Version 1.1). \\nhttps://doi.org/10.6028/NIST.CSWP.04162018 \\nNIST. (2022a). NIST AI risk management framework: Playbook (Govern 4.1). \\nhttps://perma.cc/T7MQ-MVJ7 \\nNIST. (2022b). NIST AI risk management framework: Second draft. https://perma.cc/6EJ9-\\nUZ9A \\nNunn, R. (2020). Discrimination in the age of algorithms. In W. Barfield (Ed.), The \\nCambridge handbook of the law of algorithms (pp. 182–198). Cambridge University \\nPress. https://doi.org/10.1017/9781108680844.010 \\nOrd, T. (2021). Proposal for a new ‘three lines of defence’ approach to UK risk management. \\nFuture of Humanity Institute, University of Oxford. https://perma.cc/VHH9-L36R \\nOussii, A. A., & Boulila Taktak, N. (2018). The impact of internal audit function charac-\\nteristics on internal control quality. Managerial Auditing Journal, 33(5), 450–469. \\nhttps://doi.org/10.1108/MAJ-06-2017-1579 \\nPartnership on AI. (2021). Managing the risks of AI research: Six recommendations for \\nresponsible publication. https://perma.cc/BX5A-KE8D \\nPerez, E., Huang, S., Song, F., Cai, T., Ring, R., Aslanides, J., Glaese, A., McAleese, N., & \\nIrving, G. (2022). Red teaming language models with language models. arXiv. \\nhttps://doi.org/10.48550/arXiv.2202.03286 \\nPower, M., Ashby, S., & Palermo, T. (2013). Risk culture in financial organisations: A re-\\nsearch \\nreport. \\nThe \\nLondon \\nSchool \\nof \\nEconomics \\nand \\nPolitical \\nScience. \\nhttps://perma.cc/R9YC-AT4Z \\nJonas Schuett \\n21 \\nRaji, I. D., Kumar, I. E., Horowitz, A., & Selbst, A. (2022). The fallacy of AI functionality. \\n2022 ACM Conference on Fairness, Accountability, and Transparency, 959–972. \\nhttps://doi.org/10.1145/3531146.3533158 \\nRaji, I. D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, \\nJ., Theron, D., & Barnes, P. (2020). Closing the AI accountability gap: Defining an end-\\nto-end framework for internal algorithmic auditing. arXiv. https://doi.org/10.48550/ar-\\nXiv.2001.00973 \\nRando, J., Paleka, D., Lindner, D., Heim, L., & Tramèr, F. (2022). Red-teaming the stable \\ndiffusion safety filter. arXiv. https://doi.org/10.48550/arXiv.2210.04610 \\nRao, A., & Golbin, I. (2021, January 11). Top-down and end-to-end governance for the \\nresponsible use of AI. Medium. https://perma.cc/SM8Y-6CUN \\nRoussy, M., & Rodrigue, M. (2018). Internal audit: Is the ‘third line of defense’ effective as \\na form of governance? An exploratory study of the impression management techniques \\nchief audit executives use in their annual accountability to the audit committee. Journal \\nof Business Ethics, 151, 853–869. 
https://doi.org/10.1007/s10551-016-3263-y \\nRupšys, R., & Boguslauskas, V. (2007). Measuring performance of internal auditing: Empi-\\nrical evidence. Engineering Economics, 55(5): 9–15. https://perma.cc/RA8D-35B3 \\nSavčuk, O. (2007). Internal audit efficiency evaluation principles. Journal of Business Eco-\\nnomics & Management, 8(4), 275–284. https://doi.org/10.3846/16111699.2007.9636180 \\nSchuett, J. (2022). Risk management in the Artificial Intelligence Act. arXiv. \\nhttps://doi.org/10.48550/arXiv.2212.03109 \\nSekar, M. (2022). Machine learning for auditors: Automating fraud investigations through \\nartificial intelligence. Apress. https://doi.org/10.1007/978-1-4842-8051-5 \\nSevilla, J., Heim, L., Ho, A., Besiroglu, T., Hobbhahn, M., & Villalobos, P. (2022). Compute \\ntrends across three eras of machine learning. arXiv. https://doi.org/10.48550/ar-\\nXiv.2202.05924 \\nShevlane, T. (2022). Structured access: An emerging paradigm for safe AI deployment. In \\nJ. B. Bullock, Y.-C. Chen, J. Himmelreich, V. M. Hudson, A. Korinek, M. M. Young, & \\nB. Zhang (Eds.), The Oxford handbook of AI governance. Oxford University Press. \\nhttps://doi.org/10.1093/oxfordhb/9780197579329.013.39 \\nSmuha, N. A. (2021). Beyond the individual: Governing AI’s societal harm. Internet Policy \\nReview, 10(3). https://doi.org/10.14763/2021.3.1574 \\nSolaiman, I., & Dennison, C. (2021). Process for adapting language models to society \\n(PALMS) with values-targeted datasets. Advances in Neural Information Processing Sys-\\ntems 34, 5861–5873. https://perma.cc/TF3N-SBPS \\nSolaiman, I., Brundage, M., Clark, J., Askell, A., Herbert-Voss, A., Wu, J., Radford, A., \\nKrueger, G., Kim, J. W., Kreps, S., McCain, M., Newhouse, A., Blazakis, J., McGuffie, \\nK., & Wang, J. (2019). Release strategies and the social impacts of language models. \\narXiv. https://doi.org/10.48550/arXiv.1908.09203 \\nTammenga, A. (2020). The application of artificial intelligence in banks in the context of \\nthe three lines of defence model. Maandblad Voor Accountancy En Bedrijfseconomie, \\n94(5/6), 219–230. https://doi.org/10.5117/mab.94.47158 \\nTversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. \\nScience, 185(4157), 1124–1131. https://doi.org/10.1126/science.185.4157.1124 \\nVousinas, G. L. (2021). Beyond the three lines of defense: The five lines of defense model \\nfor financial institutions. ACRN Journal of Finance and Risk Perspectives, 10(1), 95–\\n110. https://doi.org/10.35944/jofrp.2021.10.1.006 \\nThree lines of defense against risks from AI \\n22 \\nWachter, S., Mittelstadt, B., & Russell, C. (2021). Why fairness cannot be automated: \\nBridging the gap between EU non-discrimination law and AI. Computer Law & Security \\nReview, 41, 105567. https://doi.org/10.1016/j.clsr.2021.105567 \\nWalker, D. (2009). A review of corporate governance in UK banks and other financial in-\\ndustry entities: Final recommendations. https://perma.cc/2K9C-EMME \\nWeidinger, L., Mellor, J., Rauh, M., Griffin, C., Uesato, J., Huang, P.-S., Cheng, M., Glaese, \\nM., Balle, B., Kasirzadeh, A., Kenton, Z., Brown, S., Hawkins, W., Stepleton, T., Biles, \\nC., Birhane, A., Haas, J., Rimell, L., Hendricks, L. A., … Gabriel, I. (2021). Ethical and \\nsocial risks of harm from language models. arXiv. https://doi.org/10.48550/ar-\\nXiv.2112.04359 \\nZhivitskaya, M. (2015). The practice of risk oversight since the global financial crisis: Clo-\\nsing the stable door? 
[Doctoral dissertation, The London School of Economics and Poli-\\ntical Science]. LSE Theses Online. https://perma.cc/KKA6-QK56 \\n\\n\"}\n", "==================================\u001B[1m Ai Message \u001B[0m==================================\n", "Tool Calls:\n", " arvix_search (46d8d4a4-8017-4c5c-b506-5ccf60101023)\n", " Call ID: 46d8d4a4-8017-4c5c-b506-5ccf60101023\n", " Args:\n", " query: Physics and Society August 11 2016\n", "=================================\u001B[1m Tool Message \u001B[0m=================================\n", "Name: arvix_search\n", "\n", "{\"arvix_results\": \"\\nCorrelations of consumption patterns in social-economic\\nnetworks\\nYannick Leo1, M´arton Karsai1,*, Carlos Sarraute2 and Eric Fleury1\\n1Univ Lyon, ENS de Lyon, Inria, CNRS, UCB Lyon 1, LIP UMR 5668, IXXI, F-69342, Lyon, France\\n2Grandata Labs, Bartolome Cruz 1818 V. Lopez. Buenos Aires, Argentina\\n*Corresponding author: marton.karsai@ens-lyon.fr\\nAbstract\\nWe analyze a coupled anonymized dataset collecting the\\nmobile phone communication and bank transactions his-\\ntory of a large number of individuals.\\nAfter mapping\\nthe social structure and introducing indicators of socioe-\\nconomic status, demographic features, and purchasing\\nhabits of individuals we show that typical consumption\\npatterns are strongly correlated with identified socioe-\\nconomic classes leading to patterns of stratification in\\nthe social structure.\\nIn addition we measure correla-\\ntions between merchant categories and introduce a cor-\\nrelation network, which emerges with a meaningful com-\\nmunity structure.\\nWe detect multivariate relations be-\\ntween merchant categories and show correlations in pur-\\nchasing habits of individuals. Our work provides novel\\nand detailed insight into the relations between social and\\nconsuming behaviour with potential applications in rec-\\nommendation system design.\\n1\\nIntroduction\\nThe consumption of goods and services is a cru-\\ncial element of human welfare.\\nThe uneven dis-\\ntribution of consumption power among individuals\\ngoes hand in hand with the emergence and reserva-\\ntion of socioeconomic inequalities in general.\\nIndi-\\nvidual financial capacities restrict personal consumer\\nbehaviour, arguably correlate with one’s purchas-\\ning preferences, and play indisputable roles in deter-\\nmining the socioeconomic position of an ego in the\\nlarger society [1, 2, 3, 4, 5].\\nInvestigation of rela-\\ntions between these characters carries a great poten-\\ntial in understanding better rational social-economic\\nbehaviour [6], and project to direct applications in\\npersonal marketing, recommendation, and advertis-\\ning.\\nSocial\\nNetwork\\nAnalysis\\n(SNA)\\nprovides\\none\\npromising direction to explore such problems [7], due\\nto its enormous benefit from the massive flow of hu-\\nman behavioural data provided by the digital data\\nrevolution [8].\\nThe advent of this era was propa-\\ngated by some new data collection techniques, which\\nallowed the recording of the digital footprints and in-\\nteraction dynamics of millions of individuals [9, 10].\\nOn the other hand, although social behavioural data\\nbrought us detailed knowledge about the structure\\nand dynamics of social interactions, it commonly\\nfailed to uncover the relationship between social and\\neconomic positions of individuals. 
Nevertheless, such\\ncorrelations play important roles in determining one’s\\nsocioeconomic status (SES) [11], social tie formation\\npreferences due to status homophily [12, 13], and in\\nturn potentially stand behind the emergent stratified\\nstructure and segregation on the society level [4, 14].\\nHowever until now, the coupled investigation of indi-\\nvidual social and economic status remained a great\\nchallenge due to lack of appropriate data recording\\nsuch details simultaneously.\\nAs individual economic status restricts one’s capac-\\nity in purchasing goods and services, it induces diver-\\ngent consumption patterns between people at differ-\\nent socioeconomic positions [6, 1, 2]. This is reflected\\nby sets of commonly purchased products, which are\\nfurther associated to one’s social status [15]. Con-\\nsumption behaviour has been addressed from vari-\\nous angles considering e.g. environmental effects, so-\\ncioeconomic position, or social influence coming from\\nconnected peers [1]. However, large data-driven stud-\\nies combining information about individual purchas-\\ning and interaction patterns in a society large pop-\\nulation are still rare, although questions about cor-\\nrelations between consumption and social behaviour\\n1\\narXiv:1609.03756v2 [cs.SI] 21 Dec 2017\\nare of utmost interest.\\nIn this study we address these crucial problems\\nvia the analysis of a dataset,\\nwhich simultane-\\nously records the mobile-phone communication, bank\\ntransaction history, and purchase sequences of mil-\\nlions of inhabitants of a single country over several\\nmonths.\\nThis corpus, one among the firsts at this\\nscale and details, allows us to infer the socioeconomic\\nstatus, consumption habits, and the underlying social\\nstructure of millions of connected individuals. Using\\nthis information our overall goal is to identify people\\nwith certain financial capacities, and to understand\\nhow much money they spend, on what they spend,\\nand whether they spend like their friends? More pre-\\ncisely, we formulate our study around two research\\nquestions:\\n• Can one associate typical consumption patterns\\nto people and to their peers belonging to the\\nsame or different socioeconomic classes, and if\\nyes how much such patterns vary between indi-\\nviduals or different classes?\\n• Can one draw relations between commonly pur-\\nchased goods or services in order to understand\\nbetter individual consumption behaviour?\\nAfter reviewing the related literature in Section 2,\\nwe describe our dataset in Section 3, and introduce\\nindividual socioeconomic indicators to define socioe-\\nconomic classes in Section 4. In Section 5 we show\\nhow typical consumption patterns vary among classes\\nand relate them to structural correlations in the social\\nnetwork. In Section 6 we draw a correlation network\\nbetween consumption categories to detect patterns of\\ncommonly purchased goods and services. Finally we\\npresent some concluding remarks and future research\\nideas.\\n2\\nRelated work\\nEarlier hypothesis on the relation between consump-\\ntion patterns and socioeconomic inequalities, and\\ntheir correlations with demographic features such as\\nage, gender, or social status were drawn from spe-\\ncific sociological studies [16] and from cross-national\\nsocial surveys [17]. 
However, recently available large\\ndatasets help us to effectively validate and draw new\\nhypotheses as population-large individual level obser-\\nvations and detailed analysis of human behavioural\\ndata became possible. These studies shown that per-\\nsonal social interactions, social influence [1], or ho-\\nmophily [22] in terms of age or gender [20] have strong\\neffects on purchase behaviour, knowledge which led\\nto the emergent domain of online social market-\\ning [21].\\nYet it is challenging to measure correla-\\ntions between individual social status, social network,\\nand purchase patterns simultaneously. Although so-\\ncioeconomic parameters can be estimated from com-\\nmunication networks [18] or from external aggregate\\ndata [19] usually they do not come together with indi-\\nvidual purchase records. In this paper we propose to\\nexplore this question through the analysis of a com-\\nbined dataset proposing simultaneous observations of\\nsocial structure, economic status and purchase habits\\nof millions of individuals.\\n3\\nData description\\nIn the following we are going to introduce two\\ndatasets extracted from a corpus combining the mo-\\nbile phone interactions with purchase history of indi-\\nviduals.\\nDS1: Ego social-economic data with\\npurchase distributions\\nCommunication data used in our study records the\\ntemporal sequence of 7,945,240,548 call and SMS in-\\nteractions of 111,719,360 anonymized mobile phone\\nusers for 21 consecutive months. Each call detailed\\nrecord (CDR) contains the time, unique caller and\\ncallee encrypted IDs, the direction (who initiate the\\ncall/SMS), and the duration of the interaction. At\\nleast one participant of each interaction is a client of a\\nsingle mobile phone operator, but other mobile phone\\nusers who are not clients of the actual provider also\\nappear in the dataset with unique IDs. All unique\\nIDs are anonymized as explained below, thus indi-\\nvidual identification of any person is impossible from\\nthe data. Using this dataset we constructed a large\\nsocial network where nodes are users (whether clients\\nor not of the actual provider), while links are drawn\\nbetween any two users if they interacted (via call or\\nSMS) at least once during the observation period. We\\nfiltered out call services, companies, and other non-\\nhuman actors from the social network by removing\\nall nodes (and connected links) who appeared with\\neither in-degree kin = 0 or out-degree kout = 0.\\nWe repeated this procedure recursively until we re-\\nceived a network where each user had kin, kout > 0,\\ni.\\ne.\\nmade at least one out-going and received at\\nleast one in-coming communication event during the\\nnearly two years of observation. After construction\\n2\\nand filtering the network remained with 82,453,814\\nusers connected by 1,002,833,289 links, which were\\nconsidered to be undirected after this point.\\nTo calculate individual economic estimators we\\nused a dataset provided by a single bank. This data\\nrecords financial details of 6,002,192 people assigned\\nwith unique anonymized identifiers over 8 consecutive\\nmonths.\\nThe data provides time varying customer\\nvariables as the amount of their debit card purchases,\\ntheir monthly loans, and static user attributes such\\nas their billing postal code (zip code), their age and\\ntheir gender.\\nA subset of IDs of the anonymized bank and mobile\\nphone customers were matched1. 
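As a rough illustration of the recursive in/out-degree filtering used to construct the social graph in the data section above, here is a minimal sketch, not taken from the paper or from this notebook. It assumes the call records have already been reduced to (caller, callee) pairs; the function and variable names are hypothetical.

```python
# Minimal sketch of the recursive degree filtering described above (illustrative only).
# Assumes `edges` is an iterable of (caller_id, callee_id) pairs.
import networkx as nx

def build_filtered_social_graph(edges):
    g = nx.DiGraph()
    g.add_edges_from(edges)
    # Repeatedly drop nodes that never initiated or never received an interaction,
    # until every remaining node has in-degree > 0 and out-degree > 0.
    while True:
        to_drop = [n for n in g.nodes
                   if g.in_degree(n) == 0 or g.out_degree(n) == 0]
        if not to_drop:
            break
        g.remove_nodes_from(to_drop)
    # After filtering, ties are treated as undirected, as in the paper.
    return g.to_undirected()

# Example: node "c" is dropped because it never receives a call/SMS.
# social_graph = build_filtered_social_graph([("a", "b"), ("b", "a"), ("c", "a")])
```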
This way of com-\\nbining the datasets allowed us to simultaneously ob-\\nserve the social structure and estimate economic sta-\\ntus (for definition see Section 4) of the connected in-\\ndividuals. This combined dataset contained 999,456\\nIDs, which appeared in both corpuses.\\nHowever,\\nfor the purpose of our study we considered only the\\nlargest connected component of this graph. This way\\nwe operate with a connected social graph of 992,538\\npeople connected by 1,960,242 links, for all of them\\nwith communication events and detailed bank records\\navailable.\\nTo study consumption behaviour we used purchase\\nsequences recording the time, amount, merchant cat-\\negory code of each purchase event of each individual\\nduring the observation period of 8 months. Purchase\\nevents are linked to one of the 281 merchant cate-\\ngory codes (mcc) indicating the type of the actual\\npurchase, like fast food restaurants, airlines, gas sta-\\ntions, etc. Due to the large number of categories in\\nthis case we decided to group mccs by their types into\\n28 purchase category groups (PCGs) using the cate-\\ngorization proposed in [23]. After analyzing each pur-\\nchase groups 11 of them appeared with extremely low\\nactivity representing less than 0.3% (combined) of the\\ntotal amount of purchases, thus we decided to remove\\nthem from our analysis and use only the remaining\\nK17 set of 17 groups (for a complete list see Fig.2a).\\nNote that the group named Service Providers (k1\\nwith mcc 24) plays a particular role as it corresponds\\nto cash retrievals and money transfers and it repre-\\nsents around 70% of the total amount of purchases.\\nAs this group dominates over other ones, and since\\nwe have no further information how the withdrawn\\n1 The matching, data hashing, and anonymization proce-\\ndure was carried out without the involvement of the scientific\\npartner.\\nAfter this procedure only anonymized hashed IDs\\nwere shared disallowing the direct identification of individuals\\nin any of the datasets.\\ncash was spent, we analyze this group k1 separately\\nfrom the other K2-17 = K17\\\\{k1} set of groups.\\nThis way we obtained DS1, which collects the social\\nties, economic status, and coarse grained purchase\\nhabit informations of ∼1 million people connected\\ntogether into a large social network.\\nDS2: Detailed ego purchase distributions\\nwith age and gender\\nFrom the same bank transaction trace of 6,002,192\\nusers, we build a second data set DS2. This dataset\\ncollects data about the age and gender of individu-\\nals together with their purchase sequence recording\\nthe time, amount, and mcc of each debit card pur-\\nchase of each ego. To obtain a set of active users we\\nextracted a corpus of 4,784,745 people that were ac-\\ntive at least two months during the observation pe-\\nriod. Then for each ego, we assigned a feature set\\nPV (u) : {ageu, genderu, SEGu, r(ci, u)} where SEG\\nassigns a socioeconomic group (for definition see Sec-\\ntion 4) and r(ci, u) is an ego purchase distribution\\nvector defined as\\nr(ci, u) =\\nmci\\nu\\nP\\nci mci\\nu\\n.\\n(1)\\nThis vector assigns the fraction of mci\\nu money spent\\nby user u on a merchant category ci during the obser-\\nvation period. We excluded purchases corresponding\\nto cash retrievals and money transfers, which would\\ndominate our measures otherwise. 
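A minimal sketch of the ego purchase distribution r(c_i, u) of Eq. (1), assuming a pandas DataFrame `purchases` with hypothetical columns user_id, mcc and amount; the `excluded_mccs` argument stands in for the cash-retrieval and money-transfer categories excluded above. This is illustrative, not the authors' code.

```python
# Sketch of Eq. (1): fraction of user u's spending that goes to merchant category c_i.
import pandas as pd

def purchase_distributions(purchases: pd.DataFrame, excluded_mccs=()) -> pd.DataFrame:
    kept = purchases[~purchases["mcc"].isin(excluded_mccs)]
    # Total spending of each user in each merchant category: m^{c_i}_u
    spent = kept.groupby(["user_id", "mcc"])["amount"].sum().unstack(fill_value=0.0)
    # Normalise each row so that r(c_i, u) sums to 1 over categories for every user.
    return spent.div(spent.sum(axis=1), axis=0)

# r = purchase_distributions(purchases, excluded_mccs=cash_and_transfer_mccs)
```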
A minor fraction\\nof purchases are not linked to valid mccs, thus we\\nexcluded them from our calculations.\\nThis way DS2 collects 3,680,652 individuals, with-\\nout information about their underlying social net-\\nwork, but all assigned with a PV (u) vector describing\\ntheir personal demographic and purchasing features\\nin details.\\n4\\nMeasures of socioeconomic position\\nTo estimate the personal economic status we used a\\nsimple measure reflecting the consumption power of\\neach individual. Starting from the raw data of DS2,\\nwhich collects the amount and type of debit card pur-\\nchases, we estimated the economic position of individ-\\nuals as their average monthly purchase (AMP). More\\nprecisely, in case of an ego u who spent mu(t) amount\\nin month t we calculated the AMP as\\nPu =\\nP\\nt∈T mu(t)\\n|T|u\\n(2)\\n3\\nwhere |T|u corresponds to the number of active\\nmonths of user u (with at least one purchase in each\\nmonth). After sorting people by their AMP values\\nwe computed the normalized cumulative distribution\\nfunction of Pu as\\nC(f) =\\nPf\\nf ′=0 Pu(f ′)\\nP\\nu Pu\\n(3)\\nas a function of f fraction of people.\\nThis func-\\ntion (Fig.1a) appears with high variance and sug-\\ngests large imbalances in terms of the distribution of\\neconomic capacities among individuals in agreement\\nwith earlier social theory [27].\\n0.0\\n0.2\\n0.4\\n0.6\\n0.8\\n1.0\\nf\\n0.0\\n0.2\\n0.4\\n0.6\\n0.8\\n1.0\\nCW(f)\\nCP(f)\\nf\\n(a)\\nClass 1\\nClass 4\\nClass 2\\nClass 3\\nClass 5\\nClass 8\\nClass 6\\nClass 7\\nClass 9\\n(a)\\n(b)\\nFig. 1: Social class characteristics (a) Schematic\\ndemonstration of user partitions into 9 socioe-\\nconomic classes by using the cumulative AMP\\nfunction C(f). Fraction of egos belonging to\\na given class (x axis) have the same sum of\\nAMP (P\\nu Pu)/n (y axis) for each class. (b)\\nNumber of egos (green) and the average AMP\\n⟨P⟩(in USD) per individual (yellow) in differ-\\nent classes.\\nSubsequently we used the C(f) function to assign\\negos into 9 economic classes (also called socioeco-\\nnomic classes with smaller numbers assigning lower\\nclasses) such that the sum of AMP in each class sj\\nwas the same equal to (P\\nu Pu)/n (Fig.1). We de-\\ncided to use 9 distinct classes based on the common\\nthree-stratum model [25], which identifies three main\\nsocial classes (lower, middle, and upper), and for each\\nof them three sub-classes [26]. There are several ad-\\nvantages of this classification:\\n(a) it relies merely\\non individual economic estimators, Pu, (b) naturally\\npartition egos into classes with decreasing sizes for\\nricher groups and (c) increasing ⟨P⟩average AMP\\nvalues per egos (Fig.1b).\\n5\\nSocioeconomic correlations in\\npurchasing patterns\\nIn order to address our first research question we\\nwere looking for correlations between individuals in\\ndifferent socioeconomic classes in terms of their con-\\nsumption behaviour on the level of purchase category\\ngroups.\\nWe analyzed the purchasing behaviour of\\npeople in DS1 after categorizing them into socioeco-\\nnomic classes as explained in Section 4.\\nFirst for each class sj we take every user u ∈sj\\nand calculate the mk\\nu total amount of purchases they\\nspent on a purchase category group k ∈K17. 
Then\\nwe measure a fractional distribution of spending for\\neach PCGs as:\\nr(k, sj) =\\nP\\nu∈sj mk\\nu\\nP\\nu∈s mku\\n,\\n(4)\\nwhere s = S\\nj sj assigns the complete set of users.\\nIn Fig.2a each line shows the r(k, sj) distributions\\nfor a PCG as the function of sj social classes, and\\nlines are sorted (from top to bottom) by the total\\namount of money spent on the actual PCG2. Interest-\\ningly, people from lower socioeconomic classes spend\\nmore on PCGs associated to essential needs, such as\\nRetail Stores (St.), Gas Stations, Service Providers\\n(cash) and Telecom, while in the contrary, other cat-\\negories associated to extra needs such as High Risk\\nPersonal Retail (Jewelry, Beauty), Mail Phone Or-\\nder, Automobiles, Professional Services (Serv.) (ex-\\ntra health services), Whole Trade (auxiliary goods),\\nClothing St., Hotels and Airlines are dominated by\\npeople from higher socioeconomic classes. Also note\\nthat concerning Education most of the money is spent\\nby the lower middle classes, while Miscellaneous St.\\n(gift, merchandise, pet St.) and more apparently En-\\ntertainment are categories where the lowest and high-\\nest classes are spending the most.\\nFrom this first analysis we can already identify\\nlarge differences in the spending behaviour of peo-\\nple from lower and upper classes.\\nTo further in-\\nvestigate these dissimilarities on the individual level,\\nwe consider the K2-17 category set as defined in sec-\\ntion 3 (category k1 excluded) and build a spending\\nvector SV (u) = [SV2(u), ..., SV17(u)] for each ego u.\\n2 Note that in our social class definition the cumulative AMP\\nis equal for each group and this way each group represents the\\nsame economic potential as a whole. Values shown in Fig.2a\\nassign the total purchase of classes. Another strategy would\\nbe to calculate per capita measures, which in turn would be\\nstrongly dominated by values associated to the richest class,\\nhiding any meaningful information about other classes.\\n4\\n(a)\\n(b)\\n(d)\\n(c)\\n(e)\\n(g)\\n(f)\\nFig. 2: Consumption correlations in the socioeconomic network (a) r(k, si) distribution of spending\\nin a given purchase category group k ∈K17 by different classes sj. Distributions are normalised\\nas in Eq.4, i.e. sums up to 1 for each category. (b) Dispersion σSV (sj) for different socioeconomic\\nclasses considering PCGs in K2-17 (dark blue) and the single category k1 (light blue). (c) (resp.\\n(d)) Heat-map matrix representation of dSV (si, sj) (resp. dk1(si, sj)) distances between the average\\nspending vectors of pairs of socioeconomic classes considering PCGs in K2-17 (resp. k1). (e) Shannon\\nentropy measures for different socioeconomic classes considering PCGs in K2-17 (dark pink) and in\\nk17 (light pink). (f) (resp. (g)) Heat-map matrix representation of the average LSV (si, sj) (resp.\\nLk1(si, sj)) measure between pairs of socioeconomic classes considering PCGs in K2-17 (resp. k1).\\nHere each item SVk(u) assigns the fraction of money\\nmk\\nu/mu that user u spent on a category k ∈K2-17\\nout of his/her mu = P\\nk∈K mk\\nu total amount of pur-\\nchases. Using these individual spending vectors we\\ncalculate the average spending vector of a given so-\\ncioeconomic class as SV (sj) = ⟨SV (u)⟩u∈sj. 
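The fractional distribution of Eq. (4) and the average spending vectors SV(s_j) can be sketched as follows. This assumes a hypothetical users-by-category-group DataFrame `m` of amounts m^k_u and a Series `class_of` mapping each user to one of the nine socioeconomic classes; it is an illustration of the definitions above, not the paper's implementation.

```python
# Sketch of Eq. (4) and of the class-average spending vectors SV(s_j).
import pandas as pd

def class_share_per_category(m: pd.DataFrame, class_of: pd.Series) -> pd.DataFrame:
    # r(k, s_j): share of the total spending on group k contributed by class s_j.
    per_class = m.groupby(class_of).sum()            # rows: classes, columns: groups
    # Each column sums to 1, matching the normalisation stated for Eq. (4).
    return per_class.div(per_class.sum(axis=0), axis=1)

def average_spending_vectors(m: pd.DataFrame, class_of: pd.Series) -> pd.DataFrame:
    # SV_k(u): fraction of user u's own budget spent on group k.
    sv = m.div(m.sum(axis=1), axis=0)
    # SV(s_j): average individual spending vector within each class.
    return sv.groupby(class_of).mean()
```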
We as-\\nsociate SV (sj) to a representative consumer of class\\nsj and use this average vector to quantify differences\\nbetween distinct socioeconomic classes as follows.\\nThe euclidean metric between average spending\\nvectors is:\\ndSV (si, sj) = ∥SV k(si) −SV k(sj)∥2,\\n(5)\\nwhere ∥⃗v∥2 =\\npP\\nk v2\\nk assigns the L2 norm of a vec-\\ntor ⃗v. Note that the diagonal elements of dSV (si, si)\\nare equal to zero by definition. However, in Fig.2c\\nthe off-diagonal green component around the diag-\\nonal indicates that the average spending behaviour\\nof a given class is the most similar to neighboring\\nclasses, while dissimilarities increase with the gap be-\\ntween socioeconomic classes. We repeated the same\\nmeasurement separately for the single category of\\ncash purchases (PCG k1).\\nIn this case euclidean\\ndistance is defined between average scalar measures\\nas dk1(si, sj) = ∥⟨SV1⟩(si) −⟨SV1⟩(sj)∥2. Interest-\\ningly, results shown in Fig.2d.\\nindicates that here\\nthe richest social classes appear with a very different\\nbehaviour. This is due to their relative underspend-\\ning in cash, which can be also concluded from Fig.2a\\n(first row). On the other hand as going towards lower\\nclasses such differences decrease as cash usage starts\\nto dominate.\\nTo explain better the differences between socioe-\\nconomic classes in terms of purchasing patterns, we\\nintroduce two additional scalar measures. First, we\\nintroduce the dispersion of individual spending vec-\\ntors as compared to their class average as\\nσSV (sj) = ⟨∥SV k(sj) −SVk(u)∥2⟩u∈sj,\\n(6)\\nwhich appears with larger values if people in a given\\nclass allocate their spending very differently. Second,\\nwe also calculate the Shannon entropy of spending\\npatterns as\\nSSV (sj) =\\nX\\nk∈K2-17\\n−SV k(sj) log(SV k(sj))\\n(7)\\nto quantify the variability of the average spending\\nvector for each class. This measure is minimal if each\\nego of a class sj spends exclusively on the same sin-\\ngle PCG, while it is maximal if they equally spend on\\neach PCG. As it is shown in Fig.2b (light blue line\\n5\\nwith square symbols) dispersion decreases rapidly as\\ngoing towards higher socioeconomic classes. This as-\\nsigns that richer people tends to be more similar in\\nterms of their purchase behaviour.\\nOn the other\\nhand, surprisingly, in Fig.2e (dark pink line with\\nsquare symbols) the increasing trend of the corre-\\nsponding entropy measure suggests that even richer\\npeople behave more similar in terms of spending be-\\nhaviour they used to allocate their purchases in more\\nPCGs. These trends are consistent even in case of\\nk1 cash purchase category (see σSV1(sj) function de-\\npicted with dark blue line in in Fig.2b) or once we in-\\nclude category k1 into the entropy measure SSV17(sj)\\n(shown in Fig.2b with light pink line).\\nTo complete our investigation we characterize the\\neffects of social relationships on the purchase habits\\nof individuals. We address this problem through an\\noverall measure quantifying differences between indi-\\nvidual purchase vectors of connected egos positioned\\nin the same or different socioeconomic classes. 
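A sketch of the class-level dispersion of Eq. (6) and the Shannon entropy of Eq. (7), assuming `sv` is a users-by-group DataFrame of per-user spending fractions SV_k(u) (each row sums to one) and `class_of` maps users to classes, as in the previous sketch; names are hypothetical.

```python
# Sketch of Eqs. (6) and (7): within-class dispersion and entropy of the average vector.
import numpy as np
import pandas as pd

def class_dispersion(sv: pd.DataFrame, class_of: pd.Series) -> pd.Series:
    class_of = class_of.reindex(sv.index)
    mean_sv = sv.groupby(class_of).mean()
    # Average L2 distance of each member's spending vector from its class average.
    dist = np.linalg.norm(sv.values - mean_sv.loc[class_of].values, axis=1)
    return pd.Series(dist, index=sv.index).groupby(class_of).mean()

def class_entropy(sv: pd.DataFrame, class_of: pd.Series) -> pd.Series:
    mean_sv = sv.groupby(class_of.reindex(sv.index)).mean()
    p = mean_sv.clip(lower=1e-12)        # avoid log(0) for categories a class never uses
    return -(p * np.log(p)).sum(axis=1)
```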
More\\nprecisely, we consider each social tie (u, v) ∈E con-\\nnecting individuals u ∈si and v ∈sj, and for each\\npurchase category k we calculate the average absolute\\ndifference of their purchase vector items as\\ndk(si, sj) = ⟨|SVk(u) −SVk(v)|⟩u∈si,v∈sj.\\n(8)\\nFollowing that, as a reference system we generate a\\ncorresponding configuration network by taking ran-\\ndomly selected edge pairs from the underlying social\\nstructure and swap them without allowing multiple\\nlinks and self loops.\\nIn order to vanish any resid-\\nual correlations we repeated this procedure in 5×|E|\\ntimes.\\nThis randomization keeps the degree, indi-\\nvidual economic estimators Pu, the purchase vector\\nSV (u), and the assigned class of each people un-\\nchanged, but destroys any structural correlations be-\\ntween egos in the social network, consequently be-\\ntween socioeconomic classes as well. After generating\\na reference structure we computed an equivalent mea-\\nsure dk\\nrn(si, sj) but now using links (u, v) ∈Ern of the\\nrandomized network. We repeated this procedure 100\\ntimes and calculated an average ⟨dk\\nrn⟩(si, sj). In or-\\nder to quantify the effect of the social network we\\nsimply take the ratio\\nLk(si, sj) =\\ndk(si, sj)\\n⟨dkrn⟩(si, sj)\\n(9)\\nand calculate its average LSV (si, sj) = ⟨Lk(si, sj)⟩k\\nover each category group k ∈K2-17 or respectively k1.\\nThis measure shows whether connected people have\\nmore similar purchasing patterns than one would ex-\\npect by chance without considering any effect of ho-\\nmophily, social influence or structural correlations.\\nResults depicted in Fig.2f and 2g for LSV (si, sj) (and\\nLk1(si, sj) respectively) indicates that the purchas-\\ning patterns of individuals connected in the original\\nstructure are actually more similar than expected by\\nchance (diagonal component).\\nOn the other hand\\npeople from remote socioeconomic classes appear to\\nbe less similar than one would expect from the uncor-\\nrelated case (indicated by the LSV (si, sj) > 1 values\\ntypical for upper classes in Fig.2f).\\nNote that we\\nfound the same correlation trends in cash purchase\\npatterns as shown in Fig.2g. These observations do\\nnot clearly assign whether homophily [12, 13] or so-\\ncial influence [1] induce the observed similarities in\\npurchasing habits but undoubtedly clarifies that so-\\ncial ties (i.e. the neighbors of an ego) and socioeco-\\nnomic status play deterministic roles in the emerging\\nsimilarities in consumption behaviour.\\n6\\nPurchase category correlations\\nTo study consumption patterns of single purchase\\ncategories PCGs provides a too coarse grained level\\nof description. Hence, to address our second ques-\\ntion we use DS2 and we downscale from the category\\ngroup level to the level of single merchant categories.\\nWe are dealing with 271 categories after excluding\\nsome with less than 100 purchases and the categories\\nlinked to money transfer and cash retrieval (for a\\ncomplete list of IDs and name of the purchase cat-\\negories considered see Table 1). As in Section 3 we\\nassign to each ego u a personal vector PV (u) of four\\nsocioeconomic features: the age, the gender, the so-\\ncial economic group, and the distribution r(ci, u) of\\npurchases in different merchant categories made by\\nthe central ego. 
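The degree-preserving reference measure of Eqs. (8)–(9) can be approximated as below. This is a simplified, class-agnostic sketch (it averages over all ties and categories rather than per class pair), assuming an undirected networkx graph `g` whose node identifiers index the rows of the spending-fraction DataFrame `sv`; it is not the authors' code.

```python
# Sketch of Eqs. (8)-(9): tie-level spending differences, real graph vs. degree-preserving
# randomizations obtained by double-edge swaps.
import numpy as np
import networkx as nx

def mean_tie_difference(g, sv):
    # Average absolute difference of spending vectors across connected pairs,
    # averaged over categories (class-agnostic simplification of Eq. 8).
    diffs = [np.abs(sv.loc[u].values - sv.loc[v].values).mean() for u, v in g.edges()]
    return float(np.mean(diffs))

def network_effect_ratio(g, sv, n_random=10, seed=0):
    observed = mean_tie_difference(g, sv)
    randomized = []
    for i in range(n_random):
        g_rand = g.copy()
        # Degree-preserving rewiring, analogous to the configuration-model reference.
        nx.double_edge_swap(g_rand, nswap=5 * g.number_of_edges(),
                            max_tries=50 * g.number_of_edges(), seed=seed + i)
        randomized.append(mean_tie_difference(g_rand, sv))
    # Values below 1 indicate that connected egos are more similar than expected by chance.
    return observed / np.mean(randomized)
```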
Our aim here is to obtain an overall\\npicture of the consumption structure at the level of\\nmerchant categories and to understand precisely how\\npersonal and socioeconomic features correlate with\\nthe spending behaviour of individuals and with the\\noverall consumption structure.\\nAs we noted in section 5, the purchase spending\\nvector r(ci, u) of an ego quantifies the fraction of\\nmoney spent on a category ci. Using the spending\\nvectors of n number of individuals we define an over-\\nall correlation measure between categories as\\nρ(ci, cj) =\\nn(P\\nu r(ci, u)r(cj, u))\\n(P\\nu r(ci, u))(P\\nu r(cj, u)).\\n(10)\\n6\\n5211\\n1711\\n5251\\n5533\\n5942\\n2741\\n5943\\n5964\\n4111\\n4011\\n4112\\n4511\\n4722\\n5651\\n5813\\n5947\\n7011\\n4121\\n4131\\n4789\\n5309\\n5331\\n5732\\n5948\\n5993\\n5999\\n7922\\n7991\\n7999\\n9399\\n5691\\n7399\\n4215\\n4784\\n4816\\n5192\\n5399\\n5734\\n5735\\n5811\\n5812\\n5814\\n5968\\n5969\\n5970\\n5992\\n5994\\n7216\\n7230\\n7298\\n7311\\n7392\\n7512\\n7523\\n7542\\n7933\\n7941\\n7996\\n7997\\n8999\\n5967\\n5045\\n5046\\n5065\\n5085\\n5111\\n5995\\n7538\\n4582\\n5200\\n5310\\n5541\\n9311\\n4812\\n7321\\n4899\\n7372\\n7994\\n5945\\n7273\\n5983\\n4900\\n5039\\n5013\\n5072\\n5198\\n5511\\n5532\\n5021\\n5712\\n5231\\n5719\\n5950\\n5733\\n7993\\n5047\\n8011\\n8021\\n8062\\n8071\\n5722\\n5074\\n5094\\n5621\\n5631\\n5699\\n5944\\n5977\\n5131\\n5441\\n5949\\n5122\\n5137\\n5661\\n5139\\n5169\\n5172\\n5193\\n5714\\n7629\\n763\\n5655\\n5641\\n5451\\n5462\\n5973\\n5542\\n7622\\n5599\\n5571\\n5611\\n5935\\n5941\\n5697\\n5681\\n5931\\n5971\\n7296\\n7297\\n7841\\n7832\\n7210\\n7211\\n7932\\n8049\\n5921\\n7929\\n5940\\n5976\\n8641\\n5946\\n7338\\n7221\\n5965\\n7277\\n742\\n7299\\n7998\\n7361\\n8099\\n7995\\n8211\\n8220\\n(a)\\n(b)\\nCar sales and maintenance\\nHardware stores\\nOffice supply stores\\nIT services\\nBooks and newspapers\\nState services and education\\nHome supply stores\\nNewsstand and duty-free shops\\nAmusement and recreation\\nTravelling\\nTransportation and commuting\\nLeisure\\nJewellery and gift shops\\nClothing 1\\nClothing 2\\nPersonal services\\nHealth and medical services\\nFig. 3: Merchant category correlation matrix and graph (a) 163×163 matrix heatmap plot corre-\\nsponding to ρ(ci, cj) correlation values (see Eq. 10) between categories. Colors scale with the loga-\\nrithm of correlation values. Positive (resp. negative) correlations are assigned by red (resp. blue)\\ncolors. Diagonal components represent communities with frames colored accordingly.(b) Weighted\\nG>\\nρ correlation graph with nodes annotated with MCCs (see Table 1). Colors assign 17 communities\\nof merchant categories with representative names summarized in the figure legend.\\n0\\n0.5\\n1\\nfemale male\\n(a)\\n(b)\\nFig. 4: Socioeconomic parameters of merchant categories (a) Scatter plot of AFS(ci) triplets (for\\ndefinition see Eq. 11 and text) for 271 merchant categories summarized in Table 1.\\nAxis assign\\naverage age and SEG of purchase categories, while gender information are assigned by symbols. The\\nshape of symbols assigns the dominant gender (circle-female, square-male) and their size scales with\\naverage values. (b) Similar scatter plot computed for communities presented in Fig.3b. Labels and\\ncolors are explained in the legend of Fig.3a.\\n7\\nThis symmetric formulae quantifies how much peo-\\nple spend on a category ci if they spend on an other\\ncj category or vice versa. 
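Eq. (10) can be computed directly from the users-by-categories matrix of spending fractions. The sketch below assumes `r` is the (hypothetical) output of the purchase-distribution sketch earlier, with every category having nonzero total spending; it is an illustration of the formula, not the paper's implementation.

```python
# Sketch of Eq. (10): co-spending correlation between merchant categories.
import numpy as np
import pandas as pd

def category_correlation(r: pd.DataFrame) -> pd.DataFrame:
    n = len(r)
    co_spending = r.T @ r                     # sum_u r(c_i, u) * r(c_j, u)
    totals = r.sum(axis=0)                    # sum_u r(c_i, u)
    return n * co_spending / np.outer(totals.values, totals.values)

# rho = category_correlation(r)   # symmetric DataFrame; rho.loc[c_i, c_j] compared against 1
```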
Therefore, if ρ(ci, cj) > 1,\\nthe categories ci and cj are positively correlated and\\nif ρ(ci, cj) < 1, categories are negatively correlated.\\nUsing ρ(ci, cj) we can define a weighted correlation\\ngraph Gρ = (Vρ, Eρ, ρ) between categories ci ∈Vρ,\\nwhere links (ci, cj) ∈Eρ are weighted by the ρ(ci, cj)\\ncorrelation values.\\nThe weighted adjacency matrix\\nof Gρ is shown in Fig.3a as a heat-map matrix with\\nlogarithmically scaling colors. Importantly, this ma-\\ntrix emerges with several block diagonal components\\nsuggesting present communities of strongly correlated\\ncategories in the graph.\\nTo identify categories which were commonly pur-\\nchased together we consider only links with positive\\ncorrelations. Furthermore, to avoid false positive cor-\\nrelations, we consider a 10% error on r that can in-\\nduce, in the worst case 50% overestimation of the\\ncorrelation values. In addition, to consider only rep-\\nresentative correlations we take into account category\\npairs which were commonly purchased by at least\\n1000 consumers. This way we receive a G>\\nρ weighted\\nsub-graph of Gρ, shown in Fig.3b, with 163 nodes\\nand 1664 edges with weights ρ(ci, cj) > 1.5.\\nTo identify communities in G>\\nρ indicated by the\\ncorrelation matrix in Fig.3a we applied a graph parti-\\ntioning method based on the Louvain algorithm [28].\\nWe obtained 17 communities depicted with differ-\\nent colors in Fig.3b and as corresponding colored\\nframes in Fig.3a.\\nInterestingly, each of these com-\\nmunities group a homogeneous set of merchant cat-\\negories, which could be assigned to similar types of\\npurchasing activities (see legend of Fig.3b). In addi-\\ntion, this graph indicates how different communities\\nare connected together. Some of them, like Trans-\\nportation, IT or Personal Serv.\\nplaying a central\\nrole as connected to many other communities, while\\nother components like Car sales and maintenance\\nand Hardware St., or Personal and Health and med-\\nical Serv. are more like pairwise connected. Some\\ngroups emerge as standalone communities like Office\\nSupp.\\nSt., while others like Books and newspapers\\nor Newsstands and duty-free Shops (Sh.) appear as\\nbridges despite their small sizes.\\nNote that the main categories corresponding to\\neveryday necessities related to food (Supermarkets,\\nFood St.)\\nand telecommunication (Telecommunica-\\ntion Serv.) do not appear in this graph. Since they\\nare responsible for the majority of total spending,\\nthey are purchased necessarily by everyone without\\nobviously enhancing the purchase in other categories,\\nthus they do not appear with strong correlations.\\nFinally we turn to study possible correlations\\nbetween\\npurchase\\ncategories\\nand\\npersonal\\nfea-\\ntures.\\nAn\\naverage\\nfeature\\nset\\nAFS(ci)\\n=\\n{⟨age(ci)⟩, ⟨gender(ci)⟩, ⟨SEG(ci}⟩) is assigned to\\neach of the 271 categories.\\nThe average ⟨v(ci)⟩of\\na feature v ∈{age, gender, SEG} assigns a weighted\\naverage value computed as:\\n⟨v(ci)⟩=\\nP\\nu∈{u}i αi(vu)vu\\nP\\nu∈{u}u αi(v) ,\\n(11)\\nwhere vu denotes a feature of a user u from the {u}i\\nset of individuals who spent on category ci. Here\\nαi(vu) =\\nX\\n(u∈{u}i|vu=v)\\nr(ci, u)\\nni(vu)\\n(12)\\ncorresponds to the average spending on category ci\\nof the set of users from {u}i sharing the same value\\nof the feature v. ni(vu) denotes the number of such\\nusers. In other words, e.g. 
in case of v = age and c742,\\n⟨age(c742)⟩assigns the average age of people spent\\non Veterinary Services (mcc = 742) weighted by the\\namount they spent on it. In case of v = gender we\\nassigned 0 to females and 1 to males, thus the average\\ngender of a category can take any real value between\\n[0, 1], indicating more females if ⟨gender(ci)⟩≤0.5\\nor more males otherwise.\\nWe visualize this multi-modal data in Fig.4a as\\na scatter plot, where axes scale with average age\\nand SEG, while the shape and size of symbols corre-\\nspond to the average gender of each category. To fur-\\nther identify correlations we applied k-means cluster-\\ning [29] using the AFS(ci) of each category. The ideal\\nnumber of clusters was 15 according to several crite-\\nria: Davies-Bouldin Criterion, Calinski-Harabasz cri-\\nterion (variance ratio criterion) and the Gap method\\n[30].\\nColors in Fig.4a assign the identified k-mean\\nclusters.\\nThe first thing to remark in Fig.4a is that the av-\\nerage age and SEG assigned to merchant categories\\nare positively correlated with a Pearson correlation\\ncoefficient 0.42 (p < 0.01). In other words, elderly\\npeople used to purchase from more expensive cate-\\ngories, or alternatively, wealthier people tend to be\\nolder, in accordance with our intuition. At the same\\ntime, some signs of gender imbalances can be also\\nconcluded from this plot. Wealthier people appear to\\nbe commonly males rather than females. A Pearson\\ncorrelation measure between gender and SEG, which\\n8\\n742: Veterinary Serv.\\n5072: Hardware Supp.\\n5598: Snowmobile Dealers\\n5950: Glassware, Crystal St.\\n7296: Clothing Rental\\n7941: Sports Clubs\\n763: Agricultural Cooperative\\n5074: Plumbing, Heating Equip.\\n5599: Auto Dealers\\n5960: Dir Mark - Insurance\\n7297: Massage Parlors\\n7991: Tourist Attractions\\n780: Landscaping Serv.\\n5085: Industrial Supplies\\n5611: Men Cloth. St.\\n5962: Direct Marketing - Travel\\n7298: Health and Beauty Spas\\n7992: Golf Courses\\n1520: General Contr.\\n5094: Precious Objects/Stones\\n5621: Wom Cloth. St.\\n5963: Door-To-Door Sales\\n7299: General Serv.\\n7993: Video Game Supp.\\n1711: Heating, Plumbing\\n5099: Durable Goods\\n5631: Women?s Accessory Sh. 5964: Dir. Mark. Catalog\\n7311: Advertising Serv.\\n7994: Video Game Arcades\\n1731: Electrical Contr.\\n5111: Printing, Office Supp.\\n5641: Children?s Wear St.\\n5965: Dir. Mark. Retail Merchant 7321: Credit Reporting Agencies\\n7995: Gambling\\n1740: Masonry & Stonework\\n5122: Drug Proprietaries\\n5651: Family Cloth. St.\\n5966: Dir Mark - TV\\n7333: Graphic Design\\n7996: Amusement Parks\\n1750: Carpentry Contr.\\n5131: Notions Goods\\n5655: Sports & Riding St.\\n5967: Dir. Mark.\\n7338: Quick Copy\\n7997: Country Clubs\\n1761: Sheet Metal\\n5137: Uniforms Clothing\\n5661: Shoe St.\\n5968: Dir. Mark. Subscription\\n7339: Secretarial Support Serv.\\n7998: Aquariums\\n1771: Concrete Work Contr.\\n5139: Commercial Footwear\\n5681: Furriers Sh.\\n5969: Dir. Mark. Other\\n7342: Exterminating Services\\n7999: Recreation Serv.\\n1799: Special Trade Contr.\\n5169: Chemicals Products\\n5691: Cloth. 
Stores\\n5970: Artist?s Supp.\\n7349: Cleaning and Maintenance\\n8011: Doctors\\n2741: Publishing and Printing 5172: Petroleum Products\\n5697: Tailors\\n5971: Art Dealers & Galleries\\n7361: Employment Agencies\\n8021: Dentists, Orthodontists\\n2791: Typesetting Serv.\\n5192: Newspapers\\n5698: Wig and Toupee St.\\n5972: Stamp and Coin St.\\n7372: Computer Programming\\n8031: Osteopaths\\n2842: Specialty Cleaning\\n5193: Nursery & Flowers Supp.\\n5699: Apparel Accessory Sh.\\n5973: Religious St.\\n7375: Information Retrieval Serv.\\n8041: Chiropractors\\n4011: Railroads\\n5198: Paints\\n5712: Furniture\\n5975: Hearing Aids\\n7379: Computer Repair\\n8042: Optometrists\\n4111: Ferries\\n5199: Nondurable Goods\\n5713: Floor Covering St.\\n5976: Orthopedic Goods\\n7392: Consulting, Public Relations 8043: Opticians\\n4112: Passenger Railways\\n5200: Home Supply St.\\n5714: Window Covering St.\\n5977: Cosmetic St.\\n7393: Detective Agencies\\n8049: Chiropodists, Podiatrists\\n4119: Ambulance Serv.\\n5211: Materials St.\\n5718: Fire Accessories St.\\n5978: Typewriter St.\\n7394: Equipment Rental\\n8050: Nursing/Personal Care\\n4121: Taxicabs\\n5231: Glass & Paint St.\\n5719: Home Furnishing St.\\n5983: Fuel Dealers (Non Auto)\\n7395: Photo Developing\\n8062: Hospitals\\n4131: Bus Lines\\n5251: Hardware St.\\n5722: House St.\\n5992: Florists\\n7399: Business Serv.\\n8071: Medical Labs\\n4214: Motor Freight Carriers\\n5261: Nurseries & Garden St.\\n5732: Elec. St.\\n5993: Cigar St.\\n7512: Car Rental Agencies\\n8099: Medical Services\\n4215: Courier Serv.\\n5271: Mobile Home Dealers\\n5733: Music Intruments St.\\n5994: Newsstands\\n7513: Truck/Trailer Rentals\\n8111: Legal Services, Attorneys\\n4225: Public Storage\\n5300: Wholesale\\n5734: Comp.Soft. St.\\n5995: Pet Sh.\\n7519: Mobile Home Rentals\\n8211: Elem. Schools\\n4411: Cruise Lines\\n5309: Duty Free St.\\n5735: Record Stores\\n5996: Swimming Pools Sales\\n7523: Parking Lots, Garages\\n8220: Colleges Univ.\\n4457: Boat Rentals and Leases 5310: Discount Stores\\n5811: Caterers\\n5997: Electric Razor St.\\n7531: Auto Body Repair Sh.\\n8241: Correspondence Schools\\n4468: Marinas Serv. and Supp. 5311: Dep. St.\\n5812: Restaurants\\n5998: Tent and Awning Sh.\\n7534: Tire Retreading & Repair\\n8244: Business Schools\\n4511: Airlines\\n5331: Variety Stores\\n5813: Drinking Pl.\\n5999: Specialty Retail\\n7535: Auto Paint Sh.\\n8249: Training Schools\\n4582: Airports, Flying Fields\\n5399: General Merch.\\n5814: Fast Foods\\n6211: Security Brokers\\n7538: Auto Service Shops\\n8299: Educational Serv.\\n4722: Travel Agencies\\n5411: Supermarkets\\n5912: Drug St.\\n6300: Insurance\\n7542: Car Washes\\n8351: Child Care Serv.\\n4784: Tolls/Bridge Fees\\n5422: Meat Prov.\\n5921: Alcohol St.\\n7011: Hotels\\n7549: Towing Serv.\\n8398: Donation\\n4789: Transportation Serv.\\n5441: Candy St.\\n5931: Secondhand Stores\\n7012: Timeshares\\n7622: Electronics Repair Sh.\\n8641: Associations\\n4812: Phone St.\\n5451: Dairy Products St.\\n5932: Antique Sh.\\n7032: Sporting Camps\\n7623: Refrigeration Repair\\n8651: Political Org.\\n4814: Telecom.\\n5462: Bakeries\\n5933: Pawn Shops\\n7033: Trailer Parks, Camps\\n7629: Small Appliance Repair\\n8661: Religious Orga.\\n4816: Comp. Net. 
Serv.\\n5499: Food St.\\n5935: Wrecking Yards\\n7210: Laundry, Cleaning Serv.\\n7631: Watch/Jewelry Repair\\n8675: Automobile Associations\\n4821: Telegraph Serv.\\n5511: Cars Sales\\n5937: Antique Reproductions 7211: Laundries\\n7641: Furniture Repair\\n8699: Membership Org.\\n4899: Techno St.\\n5521: Car Repairs Sales\\n5940: Bicycle Sh.\\n7216: Dry Cleaners\\n7692: Welding Repair\\n8734: Testing Lab.\\n4900: Utilities\\n5531: Auto and Home Supp. St.\\n5941: Sporting St.\\n7217: Upholstery Cleaning\\n7699: Repair Sh.\\n8911: Architectural Serv.\\n5013: Motor Vehicle Supp.\\n5532: Auto St.\\n5942: Book St.\\n7221: Photographic Studios\\n7829: Picture/Video Production\\n8931: Accounting Serv.\\n5021: Commercial Furniture\\n5533: Auto Access.\\n5943: Stationery St.\\n7230: Beauty Sh.\\n7832: Cinema\\n8999: Professional Serv.\\n5039: Constr. Materials\\n5541: Gas Stations\\n5944: Jewelry St.\\n7251: Shoe Repair/Hat Cleaning\\n7841: Video Tape Rental St.\\n9211: Courts of Law\\n5044: Photographic Equip.\\n5542: Automated Fuel Dispensers 5945: Toy,-Game Sh.\\n7261: Funeral Serv.\\n7911: Dance Hall & Studios\\n9222: Government Fees\\n5045: Computer St.\\n5551: Boat Dealers\\n5946: Camera and Photo St.\\n7273: Dating/Escort Serv.\\n7922: Theater Ticket\\n9223: Bail and Bond Payments\\n5046: Commercial Equipment\\n5561: Motorcycle Sh.\\n5947: Gift Sh.\\n7276: Tax Preparation Serv.\\n7929: Bands, Orchestras\\n9311: Tax Payments\\n5047: Medical Equipment\\n5571: Motorcycle Sh.\\n5948: Luggage & Leather St.\\n7277: Counseling Services\\n7932: Billiard/Pool\\n9399: Government Serv.\\n5051: Metal Service Centers\\n5592: Motor Homes Dealers\\n5949: Fabric St.\\n7278: Buying/Shopping Serv.\\n7933: Bowling\\n9402: Postal Serv.\\n5065: Electrical St.\\nTab. 1: Codes and names of 271 merchant categories used in our study. MCCs were taken from the Merchant\\nCategory Codes and Groups Directory published by American Express [23]. Abbreviations corre-\\nspond to: Serv. - Services, Contr. - Contractors, Supp. - Supplies, St. - Stores, Equip. - Equipment,\\nMerch. - Merchandise, Prov. - Provisioners, Pl. - Places, Sh. - Shops, Mark. - Marketing, Univ. -\\nUniversities, Org. - Organizations, Lab. - Laboratories.\\nappears with a coefficient 0.29 (p < 0.01) confirmed\\nit. On the other hand, no strong correlation was ob-\\nserved between age and gender from this analysis.\\nTo have an intuitive insight about the distribution\\nof merchant categories, we take a closer look at spe-\\ncific category codes (summarized in Table 1).\\nAs\\nseen in Fig.4a elderly people tend to purchase in spe-\\ncific categories such as Medical Serv., Funeral Serv.,\\nReligious Organisations, Motorhomes Dealers, Dona-\\ntion, Legal Serv..\\nWhereas categories such as Fast\\nFoods, Video Game Arcades, Cinema, Record St., Ed-\\nucational Serv., Uniforms Clothing, Passenger Rail-\\nways, Colleges-Universities are associated to younger\\nindividuals on average.\\nAt the same time, wealth-\\nier people purchase more in categories as Snowmo-\\nbile Dealers, Secretarial Serv., Swimming Pools Sales,\\nCar Dealers Sales, while poorer people tend to pur-\\nchase more in categories related to everyday neces-\\nsities like Food St., General Merch., Dairy Products\\nSt., Fast Foods and Phone St., or to entertainment as\\nBilliard or Video Game Arcades. Typical purchase\\ncategories are also strongly correlated with gender as\\ncategories more associated to females are like Beauty\\nSh., Cosmetic St., Health and Beauty Spas, Women\\nClothing St. 
and Child Care Serv., while others are\\npreferred by males like Motor Homes Dealers, Snow-\\nmobile Dealers, Dating/Escort Serv., Osteopaths, In-\\nstruments St., Electrical St., Alcohol St. and Video\\nGame Arcades.\\nFinally we repeated a similar analysis on commu-\\nnities shown in Fig.3b, but computing the AFS on a\\nset of categories that belong to the same community.\\nResults in Fig.4b disclose positive age-SEG correla-\\ntions as observed in Fig.4a, together with somewhat\\n9\\nintuitive distribution of the communities.\\n7\\nConclusion\\nIn this paper we analyzed a multi-modal dataset col-\\nlecting the mobile phone communication and bank\\ntransactions of a large number of individuals living\\nin a single country. This corpus allowed for an in-\\nnovative global analysis both in term of social net-\\nwork and its relation to the economical status and\\nmerchant habits of individuals. We introduced sev-\\neral measures to estimate the socioeconomic status of\\neach individual together with their purchasing habits.\\nUsing these information we identified distinct socioe-\\nconomic classes, which reflected strongly imbalanced\\ndistribution of purchasing power in the population.\\nAfter mapping the social network of egos from mo-\\nbile phone interactions, we showed that typical con-\\nsumption patterns are strongly correlated with the\\nsocioeconomic classes and the social network behind.\\nWe observed these correlations on the individual and\\nsocial class level.\\nIn the second half of our study we detected corre-\\nlations between merchant categories commonly pur-\\nchased together and introduced a correlation network\\nwhich in turn emerged with communities grouping\\nhomogeneous sets of categories. We further analyzed\\nsome multivariate relations between merchant cate-\\ngories and average demographic and socioeconomic\\nfeatures, and found meaningful patterns of correla-\\ntions giving insights into correlations in purchasing\\nhabits of individuals.\\nWe identified several new directions to explore in\\nthe future.\\nOne possible track would be to better\\nunderstand the role of the social structure and inter-\\npersonal influence on individual purchasing habits,\\nwhile the exploration of correlated patterns between\\ncommonly purchased brands assigns another promis-\\ning directions. Beyond our general goal to better un-\\nderstand the relation between social and consuming\\nbehaviour these results may enhance applications to\\nbetter design marketing, advertising, and recommen-\\ndation strategies, as they assign relations between co-\\npurchased product categories.\\nAcknowledgment\\nWe thank M. Fixman for assistance.\\nWe acknowl-\\nedge the support from the SticAmSud UCOOL\\nproject, INRIA, and the SoSweet (ANR-15-CE38-\\n0011-01) and CODDDE (ANR-13-CORD-0017-01)\\nANR projects.\\nReferences\\n[1] A. Deaton, Understanding Consumption. Claren-\\ndon Press (1992).\\n[2] A. Deaton and J. Muellbauer, Economics and\\nConsumer Behavior. Cambridge University Press\\n(1980).\\n[3] T. Piketti, Capital in the Twenty-First Century.\\n(Harvard University Press, 2014).\\n[4] S. Sernau, Social Inequality in a Global Age.\\n(SAGE Publications, 2013).\\n[5] C. E. Hurst, Social Inequality. 8th ed. (Pearson\\nEducation, 2015).\\n[6] J. E. Fisher, Social Class and Consumer Behavior:\\nthe Relevance of Class and Status”, in Advances\\nin Consumer Research Vol. 14, eds. M. Wallen-\\ndorf and P. Anderson, Provo, UT : Association\\nfor Consumer Research, pp 492–496 (1987) .\\n[7] S. 
Wasserman, K. Faust, Social Network Analy-\\nsis: Methods and Applications. (Cambridge Uni-\\nversity Press, 1994).\\n[8] S. Lohr, The age of big data. (New York Times,\\n2012).\\n[9] D. Lazer, et. al. Computational Social Science.\\nScience 323, 721–723 (2009)\\n[10] A. Abraham, A-E. Hassanien, V. Smasel (eds.),\\nComputational Social Network Analysis: Trends,\\nTools and Research Advances. (Springer-Verlag,\\n2010).\\n[11] P. Bourdieu, Distinction: A Social Critique of\\nthe Judgement of Taste. Harvard University Press\\n(Cambridge MA) (1984).\\n[12] M. McPherson, L. Smith-Lovin, J. M. Cook,\\nBirds of a Feather:\\nHomophily in Social Net-\\nworks. Ann. Rev. Sociol. 27 415–444 (2001).\\n[13] P. F. Lazarsfeld, R. K. Merton, Friendship as a\\nSocial Process: A Substantive and Methodologi-\\ncal Analysis. In Freedom and Control in Modern\\nSociety. (New York: Van Nostrand, 1954) pp. 18–\\n66.\\n10\\n[14] D. B. Grusky, Theories of Stratification and In-\\nequality. In The Concise Encyclopedia of Sociol-\\nogy. pp. 622-624. (Wiley-Blackwell, 2011).\\n[15] P. West, Conspicuous Compassion: Why Some-\\ntimes It Really Is Cruel To Be Kind. Civitas, In-\\nstitute for the Study of Civil Society (London)\\n(2004).\\n[16] T. W. Chang, Social status and cultural con-\\nsumption Cambridge University Press (2010)\\n[17] A. Deaton, The analysis of household surveys: a\\nmicroeconometric approach to development pol-\\nicy. World Bank Publications (1997)\\n[18] Y. Dong, et. al., Inferring user demographics and\\nsocial strategies in mobile social networks. Proc.\\nof the 20th ACM SIGKDD international confer-\\nence on Knowledge discovery and data mining,\\n15–24 (2014)\\n[19] N. Eagle, M. Macy, R. Claxton, Network di-\\nversity and economic development. Science 328,\\n1029–1031 (2010)\\n[20] L. Kovanen, et. al., Temporal motifs reveal ho-\\nmophily, gender-specific patterns, and group talk\\nin call sequences. Proc. Nat. Acad. Sci., 110,\\n18070–18075 (2013)\\n[21] R. Felix, P. A. Rauschnabel, C. Hinsch, Elements\\nof strategic social media marketing: A holistic\\nframework. J. Business Res. online 1st (2016)\\n[22] W. Wood, T. Hayes, Social Influence on con-\\nsumer decisions:\\nMotives, modes, and conse-\\nquences. J. Consumer Psych. 22, 324–328 (2012).\\n[23] Merchant Category Codes and Groups Direc-\\ntory. American Express @ Work Reporting Ref-\\nerence (http://tinyurl.com/hne9ct5) (2008) (date\\nof access: 2/3/2016).\\n[24] P. Martineau, Social classes and spending behav-\\nior. Journal of Marketing 121–130 (1958).\\n[25] D.F. Brown, Social class and Status. In Mey, Ja-\\ncob Concise Encyclopedia of Pragmatics. Elsevier\\np. 953 (2009).\\n[26] P. Saunders, Social Class and Stratification.\\n(Routledge, 1990).\\n[27] V. Pareto, Manual of Political Economy. Reprint\\n(New English Trans) edition (1971).\\n[28] V. Blondel, et. al., Fast unfolding of communi-\\nties in large networks. J. Stat.l Mech: theory and\\nexperiment P10008 (2008).\\n[29] C. M. Bishop, Neural Networks for Pattern\\nRecognition. (Oxford University Press, Oxford,\\nEngland) (1995).\\n[30] R. Tibshirani, G. Walther, T. Hastie, Estimating\\nthe number of clusters in a data set via the gap\\nstatistic. J. Roy. Stat. Soc. 
B 63, 411-423 (2001).\\n11\\n\\n\\n\\n---\\n\\n\\nThe Masterclass of particle physics and scientific\\ncareers from the point of view of male and female\\nstudents\\nSandra Leone∗\\nINFN Sezione di Pisa\\nE-mail: sandra.leone@pi.infn.it\\nThe Masterclass of particle physics is an international outreach activity which provides an op-\\nportunity for high-school students to discover particle physics. The National Institute of Nuclear\\nPhysics (INFN) in Pisa has taken part in this effort since its first year, in 2005. The Masterclass\\nhas become a point of reference for the high schools of the Tuscan area around Pisa. Each year\\nmore than a hundred students come to our research center for a day. They listen to lectures, per-\\nform measurements on real data and finally they join the participants from the other institutes in a\\nvideo conference, to discuss their results. At the end of the day a questionnaire is given to the stu-\\ndents to assess if the Masterclass met a positive response. Together with specific questions about\\nthe various activities they took part in during the day, we ask them if they would like to become\\na scientist. They are offered 15 possible motivations for a “yes” or a “no” to choose from. The\\ndata collected during the years have been analysed from a gender perspective. Attracting female\\nstudents to science and technology-related careers is a very real issue in the European countries.\\nWith this study we tried to investigate if male and female students have a different perception of\\nscientific careers. At the end, we would like to be able to provide hints on how to intervene to\\ncorrect the path that seems to naturally bring male students towards STEM disciplines (science,\\ntechnology, engineering, and mathematics) and reject female students from them.\\n38th International Conference on High Energy Physics\\n3-10 August 2016\\nChicago, USA\\n∗Speaker.\\nc\\n⃝Copyright owned by the author(s) under the terms of the Creative Commons\\nAttribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).\\nhttp://pos.sissa.it/\\narXiv:1611.05297v1 [physics.ed-ph] 16 Nov 2016\\nMasterclass and scientific careers\\nSandra Leone\\n1. Introduction\\nThe International Masterclasses for Particle Physics (MC) give students the opportunity to be\\nparticle physicists for a day [1]. Each year in spring high school students and their teachers spend\\none day in reasearch institutes and universities around the world. They first attend introductory\\nlectures about particle physics (on the standard model of elementary particles, accelerators and\\ndetectors), then they work as scientists, making measurements on real data collected at CERN by\\nthe LHC experiments. At the end of their research day they experience the international aspect of\\nreal collaborations in particle physics, by presenting their findings in a video linkup with CERN or\\nFermilab and student groups in other participating countries.\\nThe Pisa unit of the National Institute for Nuclear Physics joined the MC since the first year,\\nin 2005 (World Year of Physics) [2]. Each year more than a hundred students 18-19 years old\\nattending the last year (the fifth one) of high school come to our institute. 
They are selected by their schools, taking into account their expression of interest in the initiative and the previous year's grades; in addition, for a few years now we have asked teachers to reflect the gender distribution of the school in the list of selected students.\nAt the end of the videoconference a questionnaire is given to the students to assess whether the Masterclass met a positive response. Approximately 80% of the students taking part in the Masterclass fill in the questionnaire. Together with specific questions about the various activities they attended during the day, we ask them if they would like to become a scientist. The data collected since 2010 have been analyzed from a gender perspective. About 500 students filled in the questionnaire, 300 male and 200 female students.\n2. Analysis of the questionnaire: general part\nWe ask the students several questions related to the various aspects of the Masterclass: were the lectures understandable? was your physics background adequate? was the measurement fun? was the videoconference easy to follow? Then we ask them more general questions: were the Masterclass topics interesting? was the Masterclass helpful to better understand what physics is and for the choice of your future studies? after taking part in the Masterclass, is your interest in physics less, equal, or greater than before? is it worth participating in a particle physics Masterclass?\nFig. 1 shows an example of the answers to some of the questions, in blue for male students, in red for female students. One can see that the distribution of answers is very similar for male and female students. Fig. 2 (left) shows the only question for which we get a different distribution of the answers: are you interested in physics outside school? A similar pattern was already observed in a very preliminary study performed on a smaller number of questionnaires in 2010 [3].\n3. Analysis of the questionnaire: would you like to be a scientist?\nFinally, we ask the students: would you like to work or do research in a STEM (physics, technology, engineering, and mathematics) discipline? The distribution of their answers is shown in fig. 2 (right). A certain difference between male and female answers is seen.\n[Figure 1: four bar-chart panels (% by gender): “Were the Masterclass topics interesting?”, “Was the Masterclass useful to understand what physics is?”, “After taking part in the Masterclass your interest in physics is...”, “Was it worth it to participate?”]\nFigure 1: Distribution (in %) of some of the answers given by male and female students.\n[Figure 2: two bar-chart panels (% by gender): “Are you interested in physics outside school?”, “Would you like to be a scientist?”]\nFigure 2: Left: distribution (in %) of the answer to the question: are you interested in physics outside school? A significant difference between male and female students is seen. 
Right: answer to the question: would you like to be a scientist?\nWe divided the sample into students who declared to be (not to be) interested in physics outside school, and their answer to the previous question is shown in fig. 3 left (right). Now the two distributions are very similar for male and female students.\nThe students are offered many options to choose from, to motivate their choice, and are asked to select up to a maximum of five reasons for a “yes” or a “no” among the ones listed here.\nYes, because:\n[Figure 3: two bar-chart panels (% by gender): answers for students interested / not interested in physics outside school]\nFigure 3: Distribution (in %) of the answers to the question: would you like to be a scientist? on the left (right) for students interested (not interested) in physics outside school.\n• It’s easy to find a job;\n• I have a talent for science;\n• I see myself as a scientist;\n• I like science;\n• I like to do things that are considered difficult;\n• I like the idea of studying the mysteries of the universe and finding answers to new questions;\n• I’m not scared by the idea of working in a lab, without regular meals and hours;\n• One can make a lot of money in science;\n• It’s a field where one can travel a lot;\n• The choice of career has a high priority in my life;\n• It would make my life more interesting;\n• I’m not scared by the prospects of an all-encompassing job;\n• I deeply admire scientists and consider them a role model;\n• My teachers are encouraging and are advising me to undertake a scientific career;\n• My family is encouraging me and would be very happy if I were to choose a scientific career.\nNo, because:\n• It’s difficult to find a job;\n• I have no talent for science;\n• I cannot see myself as a scientist;\n• I don’t like science;\n• Scientific disciplines are too difficult;\n• One has to study too much;\n• I would like to do more useful work;\n• Working in a lab without regular meals and hours is not for me;\n• I put my personal interests first;\n• I don’t want to sacrifice my personal life for my career;\n• I aspire to a normal life;\n• I’m scared by the prospects of an all-encompassing job: I want to have time for myself;\n• There aren’t scientists who I consider as a model;\n• My teachers are discouraging me;\n• My family is discouraging me.\n[Figure 4: bar chart (% by gender): “YES, because...”]\nFigure 4: Distribution (in %) of the motivations for wanting to be a scientist.\nFrom the distribution of the “yes” motivations, one can notice that more male (about 40%) than female (about 20%) students think that they have a talent for science. On the other hand, more female (about 37%) than male (about 23%) students are attracted by the idea of traveling.\nThe interpretation of the “no” distribution is affected by large statistical uncertainties, because only about 70 students answered “no”. However, it is interesting to notice that, among them, 65% of female students feel that they have no talent for science (compared to 40% of male students), and a few of them are discouraged by their family (while no male student is). In addition, 55% of male students are afraid that in science they will not have enough time for themselves (compared to 7% of female students).\n
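The gender breakdowns quoted above are simple percentage tabulations of the questionnaire answers. A minimal sketch of that bookkeeping in Python (hypothetical field names and toy data, not the authors' analysis code):

from collections import Counter

def percentages_by_gender(responses, question):
    # responses: list of dicts like {'gender': 'M', 'answers': {'be_scientist': 'yes', ...}}
    out = {}
    for gender in ('M', 'F'):
        picked = [r['answers'].get(question) for r in responses if r['gender'] == gender]
        counts = Counter(a for a in picked if a is not None)
        total = sum(counts.values())
        out[gender] = {choice: 100.0 * n / total for choice, n in counts.items()} if total else {}
    return out

# toy data, for illustration only
toy = [
    {'gender': 'M', 'answers': {'be_scientist': 'yes'}},
    {'gender': 'M', 'answers': {'be_scientist': 'yes'}},
    {'gender': 'F', 'answers': {'be_scientist': 'no'}},
    {'gender': 'F', 'answers': {'be_scientist': 'yes'}},
]
print(percentages_by_gender(toy, 'be_scientist'))
# {'M': {'yes': 100.0}, 'F': {'no': 50.0, 'yes': 50.0}}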
4. Conclusion\nWe present a preliminary analysis of the answers to about 500 questionnaires filled in by students attending the Masterclass of particle physics in Pisa from 2010 to 2016. Looking for differences in answers from male and female students, we notice that almost 80% of male students declare to be interested in physics outside school, compared to 46% of female students. About 90% of male students say that they would like to work in a STEM discipline, compared to about 77% of female students.\n[Figure 5: bar chart (% by gender): “NO: because ...”]\nFigure 5: Distribution (in %) of the motivations for not wanting to be a scientist.\nWe plan to continue to distribute this questionnaire to students attending the Masterclass of particle physics in Pisa and collect more data. In addition, we asked the physics teachers to propose the general section of the questionnaire concerning scientific careers also to students who will not attend the Masterclass. This will provide a control sample including students not as good as the ones coming to the Masterclass and not necessarily interested in science as a career. We aim to better understand in which respect male students are more interested in physics outside school than female students. In the end, we would like to obtain hints on how to intervene to correct the path that seems to naturally bring male students towards STEM disciplines and push female students away from them.\nReferences\n[1] http://physicsmasterclasses.org/\n[2] http://www.pi.infn.it/ leone/mc/mc2016/\n[3] G. Chiarelli, S. Leone, Le Masterclass come uno strumento per affrontare il gender gap? (Masterclasses as a tool to address the gender gap?), presented at “Comunicare Fisica 2010”.\n\n\n\n---\n\n\nDEVELOPMENTS FOR THE ISODAR@KAMLAND AND DAEδALUS DECAY-AT-REST NEUTRINO EXPERIMENTS\nJOSE R. ALONSO FOR THE ISODAR COLLABORATION\nMassachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA, 02139, USA\nConfigurations of the IsoDAR and DAEδALUS decay-at-rest neutrino experiments are described. Injector and cyclotron developments aimed at substantial increases in beam current are discussed. The IsoDAR layout and target are described, and this experiment is compared to other programs searching for sterile neutrinos.\n1 Introduction\nFigure 1 – 8Li neutrino spectrum. Dashed = actual spectrum, Solid = detector response for IBD events.\nDecay-At-Rest (DAR) experiments offer attractive features for neutrino physics studies.1 We discuss two particular regimes where the characteristics of the source are determined by the nature of the weak-interaction decay producing the neutrino, and are not affected by kinematics or characteristics of higher-energy production mechanisms. The beta-decay case is manifested in the IsoDAR experiment, a sterile-neutrino search where a 60 MeV proton beam is used to produce the parent isotope, 8Li. The product nucleus is stationary when it decays; the neutrino spectrum is shown in Figure 1. It has a high endpoint energy, over 13 MeV, and a mean energy of 6.5 MeV, both substantially higher than backgrounds from other decays, and in an area easily accessible for detection by Inverse Beta Decay (IBD) in a hydrogen-containing neutrino detector.\nFigure 2 – Neutrino spectrum from stopped π+. 
Note absence of ¯νe.\\nIn the regime where pions are produced at low en-\\nergy (with ≤800 MeV protons), pions can stop in the\\ntarget before decaying. This is the case for DAEδALUS,\\na sensitive CP violation measurement. As the nuclear\\ncapture probability for π−at rest in the target is ex-\\ntremely high, the neutrino spectrum from the stopped\\npions will be dominated by the decay of π+ by a fac-\\ntor of about 104. Figure 2 shows the neutrino spectra\\nfrom the π+ →µ+ →e+ decay. Noteworthy in this\\ndecay is the absence of electron antineutrinos, making\\nthis source a favored means of looking for appearance of\\n¯νe, again utilizing IBD in a suitable neutrino detector.\\nThese neutrino sources are isotropic, there is no\\narXiv:1611.03548v1 [physics.ins-det] 11 Nov 2016\\nkinematic directionality to define a beam. As a result, the efficiency of detection is directly\\nrelated to the solid angle subtended by the detector, placing high emphasis on having the source\\nas close to the detector as possible. In the case of IsoDAR this distance is a few meters from\\nthe detector surface (16.5 meters from the center of the KamLAND fiducial volume), in the case\\nof DAEδALUS the baseline is 20 km from the large water-Cherenkov counter (assumed to be\\nHyper-K). As the principal goals of these experiments is oscillation physics, the driving term is\\nL/E, the baseline distance divided by the neutrino energy. If E is low, the baseline L can also\\nbe low to preserve the same ratio. As a consequence, the 20 km baseline and 45 MeV average\\n¯νµ energy addresses the same oscillation point as the 1300 km, 3 GeV DUNE beam, or the 300\\nkm, 500 MeV T2K beam.\\nThe premise of these experiments is that relatively small and compact sources of neutrinos\\ncan be built and installed at the proper distances from existing or planned large water- or\\nliquid-scintillator-based neutrino detectors, providing access to the physics measurements with\\nsubstantially reduced costs.\\nWith respect to the long-baseline experiments (e.g.\\nT2K) the\\nbeamlines from the major accelerator centers operate much more efficiently and cleanly in the\\nneutrino mode, while the DAR measurements, utilizing IBD, address only the anti-neutrino\\nmode. Consequently, installing DAEδALUS cyclotrons at the proper distance from the long-\\nbaseline detectors, and operating the neutrino beams simultaneously, offers a huge improvement\\nin the sensitivity and data rates over the individual experiments. Discrimination of the source of\\nevents is straightforward, both from the energy deposition of events from each source, as well as\\nfrom timing: neutrinos from the cyclotrons are essentially continuous (up to 100% duty factor),\\nwhile those from the large accelerators are tightly pulsed with a very low overall duty factor.\\nNevertheless, the lack of directionality of DAR neutrinos, and the small solid angle between\\nsource and detector calls for the highest-possible flux from the source to ensure meaningful\\ndata rates. Available accelerator technologies and design configurations have been explored,\\nfor beam current performance, cost and footprint; we have arrived at the choice of compact\\ncyclotrons2. The only deficiency of this option is the average current. For appropriate data\\nrates, our specification is 10 mA of protons on target. 
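The L/E matching quoted above (20 km at 45 MeV, versus 1300 km at 3 GeV for DUNE and 300 km at 500 MeV for T2K) can be checked directly from the numbers stated in the text; a minimal sketch in Python, added for orientation and not part of the original paper:

# baseline (km) and mean antineutrino energy (GeV), as quoted in the text above
configs = {
    "DAEδALUS @ 20 km": (20.0, 0.045),
    "DUNE":             (1300.0, 3.0),
    "T2K":              (300.0, 0.5),
}
for name, (L_km, E_GeV) in configs.items():
    print(f"{name:18s} L/E = {L_km / E_GeV:5.0f} km/GeV")
# all three land at a few hundred km/GeV, i.e. the same oscillation regime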
This pushes the highest current from\\ncyclotrons by about a factor of 3,a and much of the accelerator development work of our group\\nto date has been devoted to addressing the factors that limit the maximum current in compact\\ncyclotrons3,4,5.\\nFigure 3 – Oscillations seen in KamLAND for a 5 year\\nIsoDAR run, for the global fit parameters still consistent\\nwith the IceCube analysis. IBD event rate is about 500\\nper day.\\nIn the next section the physics ratio-\\nnale for the IsoDAR and DAEδALUS exper-\\niments will be briefly described, while subse-\\nquent sections will address the configuration\\nof the cyclotrons, and progress made in push-\\ning the current limits from cyclotrons to the\\nrequired level. The IsoDAR target will be de-\\nscribed, capable of handling the 600 kW of\\nproton beams and optimized for 8Li produc-\\ntion. Finally, the IsoDAR experiment will be\\ncompared with other ongoing initiatives for\\nsearching for sterile neutrinos.\\n2\\nNeutrino Measurements\\n2.1\\nIsoDAR\\naIsotope-producing H−cyclotrons rarely reach 2 mA, the current record-holder for cyclotron current is the\\n3 mA PSI Injector 2, a 72 MeV separated-sector proton cyclotron injecting the 590 MeV Ring Cyclotron.\\nFigure 4 – Sensitivity of 5 year IsoDAR run compared to other ster-\\nile neutrino experiments. DANSS is a reactor experiment in Kalinin\\n(Russia)9;\\n144Ce and 51Cr are the SOX experiment at Borexino\\n(Gran Sasso, Italy)10, PROSPECT is a reactor experiment at HFIR\\nat ORNL (USA)11.\\nAnomalies in ¯νe disappearance rates\\nhave been observed in reactor and\\nradioactive source experiments6. Pos-\\ntulated to explain these has been the\\nexistence of one or more sterile neu-\\ntrinos, that do not in themselves in-\\nteract in the same manner as “ac-\\ntive” neutrinos (hence are called\\n“sterile”), however the active neutri-\\nnos can oscillate through these ster-\\nile states, and in this manner affect\\nthe ratio of appearance and disap-\\npearance from the known three fla-\\nvor eigenstates. Global fits7 of data\\nfrom experiments point to a mass\\nsplitting in the order of 1 to almost\\n8 eV 2, and a sin2(2 θ) of 0.1. Re-\\ncent analysis of IceCube data8, ex-\\nploiting a predicted resonance in the\\nMSW matrix for ¯νµ passing through\\nthe core of the earth appear to rule\\nout ∆m2 values of 1 eV 2 or below, however values above this energy are still possible.\\nThe very large ∆m2 imply a very short wavelength for the oscillations, in fact for the 8Li\\nneutrino it is measured in meters, so within the fiducial volume of KamLAND one could see\\nseveral full oscillations. Folding in the spatial and energy resolutions of the KamLAND detector\\n(12 cm/√EMeV ) and (6.4%/√EMeV ) respectively, the expected neutrino interaction pattern for\\nthe case of ∆m2 = 1.75 eV 2 is shown in Figure 3.\\nFigure 4 shows a sensitivity plot for IsoDAR, this experiment covers very well the regions of\\ninterest for sterile neutrinos.\\n2.2\\nLayout of DAEδALUS Experiment\\nSearch for CP violation in the lepton sector has been a high priority for many years. DAEδALUS\\ncombined with a long-baseline beam (e.g. T2K @ Hyper-K operating in neutrino mode only)\\ncan in 10 years cover almost all of the δ CP-violating phase angles.12\\nFigure 5 – Schematic of the two cyclotrons\\nin a DAEδALUS module.\\nThe injector\\n(DIC - DAEδALUS Injector Cyclotron) also\\nserves as the proton source for IsoDAR. 
The\\nDSRC (DAEδALUS Superconducting Ring\\nCyclotron) produces protons at 800 MeV.\\nThe experimental configuration includes three sta-\\ntions, each with identical targets that provide neutrino\\nsources (from stopped π+), one at 1.5 km (essentially\\nas close to the detector as feasible) that normalizes the\\nflux seen in the detector, one at 8 km that catches the\\nrise in the ¯νe appearance, and the principal station at\\n20 km, which measures the ¯νe appearance at the peak\\nof the oscillation curve. The absolute appearance am-\\nplitude is modulated by the CP-violating phase. The\\ncurrent on target, hence the neutrino flux, is adjusted\\nsequentially at each station (by “beam-on” timing) to\\nbe approximately equivalent to the flux from the long-\\nbaseline beam. The total timing cycle from all stations\\nallows approximately 40% of time when none are deliv-\\nering neutrinos, for background measurements.\\n3\\nCyclotron Configuration\\nFigure 5 shows schematically the basic configuration of a cyclotron “module” for DAEδALUS,\\nshowing the “chain” of injector-booster cyclotron with a top energy of 60 MeV, and the main\\nDAEδALUS superconducting ring cyclotron (DSRC) which delivers 800 MeV protons to the\\npion-production target. Note that the injector cyclotron is exactly the machine that is needed\\nfor the IsoDAR experiment, so developing this cyclotron is a direct step in the path towards\\nDAEδALUS.\\nTable 1: The most relevant parameters for the IsoDAR and DAEδALUS cyclotrons. IsoDAR has a single\\nstation with one cyclotron, DAEδALUS has three stations, at 1.5, 8, and 20 km from the detector. The\\nfirst two stations have a single cyclotron pair (DIC and DSRC), the 20 km station has two cyclotron pairs\\nfor higher power. Though the total power is high, because the targets are large and the beam is uniformly\\nspread over the target face, the power density is low enough to be handled by conventional engineering\\ndesigns. The DAEδALUS target has a long conical reentrant hole providing a very large surface area.\\nIsoDAR\\nDAEδALUS\\nParticle accelerated\\nH+\\n2\\nH+\\n2\\nMaximum energy\\n60 MeV/amu\\n800 MeV/amu\\nExtraction\\nSeptum\\nStripping\\nPeak beam current (H+\\n2 )\\n5 mA\\n5 mA\\nPeak beam current (proton)\\n10 mA\\n10 mA\\nNumber of stations\\n1\\n3\\nDuty factor\\n100%\\n15% - 50%\\n(time switching between 3 stations)\\nPeak beam power on target\\n600 kW\\n8 MW\\nPeak power density on target\\n2 kW/cm2\\n≈2 kW/cm2\\nAverage beam power on target\\n600 kW\\n1.2 to 4 MW\\nMaximum steel diameter\\n6.2 meters\\n14.5 meters\\nApproximate weight\\n450 tons\\n5000 tons\\nTable 1 lists high-level parameters for the IsoDAR and DAEδALUS cyclotrons. Note the\\npower implication of delivering 10 mA to the production targets.\\nThese very high power-\\nrequirements call for minimizing beam loss during the acceleration and transport process. Any\\nbeam loss is not only destructive of components, but also activates materials and greatly com-\\nplicates maintenance of accelerator systems. Some beam loss is unavoidable, however by appro-\\npriate use of cooled collimators and beam dumps, and by restricting as much as possible these\\nlosses to the lower energy regions of the cyclotrons, the thermal and activation damage can be\\nminimized.\\nThe single biggest innovation in these cyclotrons, aimed at increasing the maximum current,\\nis the use of H+\\n2 ions13 instead of protons or H−. 
As the biggest source of beam loss is space\\ncharge blowup at low energies, the lower q/A (2 protons for a single charge), and higher mass per\\nion (= 2 amu - atomic mass units) greatly reduces the effects of the repulsive forces of the very\\nhigh charge in a single bunch of accelerated beam. This helps keep the size of the accelerated\\nbunches down so there will be less beam lost on the inside of the cyclotron.\\nKeeping the\\nmolecular ion to the full energy also allows for stripping extraction at 800 MeV/amu, reducing\\nbeam loss in the extraction channels.\\nWhile the size and weight of these cyclotrons may appear large, there are examples of ma-\\nchines of comparable size that can serve as engineering models for beam dynamics, magnetic\\nfield design and costing. The PSI Injector 2, a 72-MeV 3-mA machine models some aspects of\\nthe IsoDAR cyclotron relating to the RF system and space-charge dominated beam dynamics14.\\nMagnet design and steel size/weight bear some similarities to IBA’s 235 MeV proton radiother-\\napy cyclotron15. The DSRC bears significant similarities to the superconducting ring cyclotron\\nat RIKEN16. While this cyclotron is designed for uranium beams, so the beam dynamics are\\nnot directly relevant, the cryostat and magnet designs are extremely close to the DAEδALUS\\nrequirements, and so serve as a good engineering and costing model for the DSRC.\\n4\\nIsoDAR developments\\nAs indicated above, efforts of our group have focused on producing high currents of H+\\n2 for\\ninjection into the IsoDAR cyclotron, modeling the capture and acceleration of these ions, and\\non the design of the target for handling 600 kW of proton beam and maximizing the production\\nof 8Li to generate the ¯νe flux delivered to KamLAND.\\n4.1\\nProducing High Currents of H+\\n2 for Injection\\nExperiments at the Best Cyclotron Systems, Inc. test stand in Vancouver, BC 3 tested the VIS\\nhigh-current proton source17 for its performance in generating H+\\n2 beams. Our requirement\\nfor H+\\n2 is a maximum of 50 mA of continuous beam from the source, which would provide an\\nadequate cushion in the event that capture into the cyclotron cannot be enhanced by efficient\\ntime-bunching of the beam (see next section). The VIS only produced about 15 mA of H+\\n2\\n(while we did measure 40 mA of protons); using this source would require efficient bunching. To\\nincrease our safety margin, a new ion source, labeled “MIST-1” has been built18 based on an\\nLBL-developed filament-driven, multicusp design19 which demonstrated a much more favorable\\np/H+\\n2 ratio, and currents in the range required. This source has been designed with a high\\ndegree of flexibility, to adjust geometric, magnetic field and plasma conditions to optimize H+\\n2\\nperformance. It is now being commissioned.\\n4.2\\nCapturing and Accelerating High Currents of H+\\n2\\nFigure 6 – Low energy injection line and central region of the DIC.\\nA short transport line connects the MIST-1 H+\\n2 ion source with the\\nRFQ buncher, which compresses the beam into packets of about\\n± 15◦. These packets are fed to the spiral inflector (photographed\\nin lower-right), electrostatic deflector plates that bend the beam into\\nthe plane of the cyclotron. The distance from the end of the RFQ\\nto the accelerating dees must be kept to a minium as there is energy\\nspread in the beam and long transport distances will cause the beam\\nto debunch. 
As a result the RFQ must be installed largely inside\\nthe steel of the cyclotron (pictured in upper right).\\nCyclotrons accelerate beam via RF\\n(radio-frequency, for our cyclotron\\naround 50 MHz) fields applied to\\nelectrodes (called “Dees”) extending\\nalong the full radial extent of the\\nbeam. Particles reaching the accel-\\nerating gap at the right phase of the\\nRF will receive a positive kick, while\\nthose arriving outside this phase an-\\ngle will be decelerated and lost. The\\nphase acceptance of the cyclotron\\nis typically about ± 15◦, so if the\\ninjected beam is not bunched lon-\\ngitudinally, only 10% of a continu-\\nous beam will be accepted.\\nHence\\nthe need for 50 mA of unbunched\\nbeam.\\nBunching is conventionally\\ndone with a double-gap RF cavity\\nplaced about one meter ahead of the\\ninjection point. Maximum efficiency\\nimprovement is no more than a fac-\\ntor of 2 or 3.\\nA novel bunching technique us-\\ning an RFQ was proposed many\\nyears ago20 that could in principle improve bunching efficiency to almost 85%. We have re-\\ncently been awarded funding from NSF to develop this technique, and are working with the\\noriginal proponent, and other key RFQ groups in the US and Europe to build and test this new\\nbuncher. Figure 6 shows schematically the central region of the cyclotron, including the MIST-1\\nsource, the RFQ, and spiral inflector that bunches and bends the beam into the plane of the\\ncyclotron.\\nOnce inflected into the plane of the cyclotron, the beam must be stably captured and ac-\\ncelerated to the full energy and extraction radius (of 2 meters in our case). In addition, there\\nmust be adequate turn separation at the outer radius to cleanly extract the beam. The parti-\\ncles experience 96 turns from injection to extraction, and the radial size of the beam must be\\ncontrolled so that a thin septum can be inserted between the 95th and 96th turns that will not\\nintercept any appreciable amount of beam. With a total of 600 kW, even a fraction of a percent\\nof beam lost on this septum can damage it.\\nFigure 7 – Configuration of IsoDAR on the\\nKamLAND site.\\nExtensive simulations, using the OPAL code21 de-\\nveloped at PSI specifically for beam-dynamics of highly\\nspace-charge-dominated beams in cyclotrons have been\\nused to show that this is possible, and to locate col-\\nlimators and scrapers in the first few turns to control\\nbeam halo (that would be intercepted on the extraction\\nseptum). This code has also shown that space-charge\\nforces can actually contribute to stability of the acceler-\\nating bunch by introducing a vortex motion within the\\nbunch that limits longitudinal and transverse growth of\\nthe bunch22.\\nThese developments give us confidence that the technical specifications for the IsoDAR\\ncyclotron can be met.\\n4.3\\nTarget design\\nThe configuration of the IsoDAR experiment is shown in Fig 7. The cyclotron is located in a\\nvault previously used for water purification, the target is located in one of the construction drifts\\nrepurposed as a control room that is no longer used.\\nFigure 8 – Target/sleeve/shielding structure. The target is 16.5 me-\\nters from the center of the KamLAND fiducial volume. Beam is bent\\n30◦to the target providing shielding for backstreaming neutrons. A\\nwobbler magnet spreads beam out on the 20 cm diameter target face.\\nThe target assembly can be pulled from the back of the structure into\\na casket. 
This hole is also shielded with removable concrete blocks. The shielding structure consists of steel and borated concrete.\nBeam is extracted from the cyclotron and transported about 50 meters to the target located close to the KamLAND detector. The 5 mA of H2+ is stripped in this transport line, and the resulting 10 mA of protons is directed to the beryllium target. Beryllium is a very efficient neutron producer; for the 60 MeV proton beam the yield is approximately 1 neutron per 10 protons. These neutrons stream through to the sleeve surrounding the target, containing small beryllium spheres (less than 1 cm diameter) surrounded by highly-enriched 7Li (99.995%). The sleeve is a cylinder 50 cm in radius and 2 meters long, and is surrounded by a 5 cm graphite reflector. Shielding outside the reflector consists of iron and borated concrete, which contains the neutron flux to limit neutrons reaching the rock walls.\nFig 8 shows the target, sleeve and shielding assembly in relation to the KamLAND detector. The 8Li yield from the moderated and captured neutrons varies with the fractional composition of beryllium and lithium in the sleeve; the maximum is about 3% (8Li per incident proton on target) for 30% (by weight) of lithium. This is close to the interstitial volume of tightly packed spheres. All numbers are based on GEANT4 calculations23.\nFigure 9 – Section through target and sleeve.\nFig 9 shows the target assembly, a spun-cast beryllium piece with the front surface (where the beam hits) being 1.8 cm thick (the range of the protons is 2 cm, so the Bragg peak, at an energy too low to efficiently produce neutrons, is in the cooling water, reducing the heat load in the target). A jet of heavy water is directed to the back surface of the target in a manner that effectively removes the 600 kW of beam power to a heat exchanger. The thermal behavior of the target is being modeled and will be experimentally tested in the future.\n5 IsoDAR Compared with other Sterile Neutrino Experiments\nTable 2 compares the IsoDAR experiment with two other sterile-neutrino search experiments, SOX10 and DANSS9. Sensitivity comparisons were given in Figure 4; the table highlights some of the rationale for the significantly higher sensitivity of IsoDAR.\nTable 2: Comparison of IsoDAR with SOX, the 144Ce experiment at Borexino, and DANSS, a representative reactor experiment. Relative sensitivities of these three experiments were shown in Fig. 4.\n | IsoDAR | SOX | DANSS\nSOURCE | 8Li | 144Ce | Fuel burning\nSpectral purity | Clean β spectrum | Clean β spectrum | Complex, with anomalies\nRate stability | Stable, dependent on accelerator | Decays with 285 day half-life | Changes with fuel aging\nEnergy of ¯νe flux maximum | 8.5 MeV | 3.4 MeV | 3.5 MeV\nDETECTOR | KamLAND | Borexino | Solid scintillator\nVolume | 900 tons | 100 tons | <10 tons\nNeutron bkgnd | Manageable shield design | Manageable shield design | Difficult to shield, limits proximity to core\nCosmic bkgnd (rock overburden) | 2700 MWE | 3400 MWE | Shallow, high muon rates\nIn summary, IsoDAR is a very compelling experiment for the search for sterile neutrinos, but because of the high event rates and excellent statistics, the reach of physics for this extremely short-baseline configuration extends to non-standard interactions, spectral shape and other neutrino-characterization experiments as well. 
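As a rough sense of scale for that source intensity, the 10 mA proton current and the ≈3% 8Li yield quoted above imply the following order-of-magnitude antineutrino production rate (a back-of-envelope sketch, not a calculation from the paper):

e_charge = 1.602e-19                 # C per proton
protons_per_s = 0.010 / e_charge     # 10 mA of protons on target
li8_yield = 0.03                     # ~3% 8Li per incident proton, as quoted above
nuebar_per_s = protons_per_s * li8_yield   # one anti-nu_e per 8Li beta decay
print(f"{protons_per_s:.2e} protons/s -> {nuebar_per_s:.2e} anti-nu_e/s")
# roughly 6e16 protons/s -> 2e15 anti-nu_e/s, before any duty-factor or livetime corrections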
The challenging technologies for producing the\\nhigh-power beams and optimizing neutrino production are being developed at a steady pace,\\never increasing the feasibility of these experiments.\\nAcknowledgments\\nWork supported by the US National Science Foundation under Grant No. NSF-PHY-1505858,\\nand by the MIT Bose Foundation.\\nReferences\\n1. A. Bungau, etal, Phys. Rev. Lett. 109, 141802 (2012)\\n2. A. Adelmann, etal, arXiv:1210.4454 [physics.acc-ph]\\n3. J.R. Alonso, etal, arXiv:1508:03850 [physics.acc-ph]\\n4. D. Winklehner, etal, arXiv:1507.07258 [physics-acc-ph]\\n5. J.J. Yang, etal, Nucl. Instrum. Methods A 704, 84 (2013)\\n6. G. Mention, etal, Phys. Rev. D 83, 073006 (2011)\\n7. C. Giunti, M. Laveder, Phys. Lett. B 706, 200 (2011), arXiv:1111.1069 [hep-ph]\\n8. G.H. Collin, C.A. Arg¨uelles, J.M Conrad, M.H. Shaevitz, Phys. Rev. Lett. (in press);\\narXiv:1607.00011 [hep-ph]\\n9. M. Danilov, arXiv:1412.0817 [physics.ins-det]\\n10. O. Smirnov, etal, Physics Procedia 61, 511 (2015)\\n11. J. Ashenfelter, etal, arXiv:1309.7647 [physics,ins-det]\\n12. C. Aberle, etal, arXiv:1307-2949 [physics.acc-ph]\\n13. L. Calabretta, etal, accelconf.web.cern.ch/AccelConf/p99/PAPERS/THP139.PDF\\n14. A.M. Kolano, etal, accelconf.web.cern.ch/AccelConf/IPAC2014/papers/tupri031.pdf\\n15. E. Syresin, etal, accelconf.web.cern.ch/AccelConf/IPAC2011/papers/weps085.pdf\\n16. K. Yamada, etal, accelconf.web.cern.ch/AccelConf/e08/papers/thpp069.pdf\\n17. L. Celona, etal, Rev. Sci. Instrum. 75, 1423 (2004)\\n18. S. Axani, etal, RSI 87, 02B704 (2016)\\n19. K.W. Ehlers, K-N. Leung, Rev. Sci. Instrum. 54, 677 (1983)\\n20. R.W. Hamm, etal, accelconf.web.cern.ch/AccelConf/c81/papers/ec-03.pdf\\n21. A. Adelmann, etal, accelconf.web.cern.ch/AccelConf/ICAP2009/papers/we3iopk01.pdf\\n22. J. Jonnerby, D. Winklehner (Private communications)\\n23. A. Bungau, etal, arXiv:1205,5790 [physics-acc-ph]\\n\\n\"}\n", "==================================\u001B[1m Ai Message \u001B[0m==================================\n", "Tool Calls:\n", " arvix_search (ada300e1-831b-4f90-a236-5d1f14b624eb)\n", " Call ID: ada300e1-831b-4f90-a236-5d1f14b624eb\n", " Args:\n", " query: AI regulation June 2022 figure three axes\n", " arvix_search (71a46374-befb-46d8-9e3d-41d9b01ceec7)\n", " Call ID: 71a46374-befb-46d8-9e3d-41d9b01ceec7\n", " Args:\n", " query: Physics and Society August 11 2016\n", "=================================\u001B[1m Tool Message \u001B[0m=================================\n", "Name: arvix_search\n", "\n", "{\"arvix_results\": \"\\nFairness in Agreement With European Values: An\\nInterdisciplinary Perspective on AI Regulation\\nAlejandra Bringas Colmenarejo1, Luca Nannini2, Alisa Rieger3, Kristen M. Scott4, Xuan\\nZhao5, Gourab K. Patro6, Gjergji Kasneci7, and Katharina Kinder-Kurlanda8\\n1University of Southampton, United Kingdom\\n2Minsait - Indra Sistemas and CiTIUS, Universidade de Santiago de Compostela, Spain\\n3Delft University of Technology, Netherlands\\n4KU Leuven, Belgium\\n5SCHUFA Holding AG and University of Tuebingen, Germany\\n6IIT Kharagpur, India and L3S Research Center, Germany\\n7SCHUFA Holding AG and University of Tuebingen, Germany\\n8Digital Age Research Center, University of Klagenfurt, Austria\\nJuly 5, 2022\\nAbstract\\nWith increasing digitalization, Artificial Intelligence (AI) is becoming ubiquitous. AI-based\\nsystems to identify, optimize, automate, and scale solutions to complex economic and societal\\nproblems are being proposed and implemented. 
This has motivated regulation efforts, including\\nthe Proposal of an EU AI Act. This interdisciplinary position paper considers various concerns\\nsurrounding fairness and discrimination in AI, and discusses how AI regulations address them,\\nfocusing on (but not limited to) the Proposal. We first look at AI and fairness through the\\nlenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various per-\\nspectives. Then, we map these perspectives along three axes of interests: (i) Standardization vs.\\nLocalization, (ii) Utilitarianism vs. Egalitarianism, and (iii) Consequential vs. Deontological\\nethics which leads us to identify a pattern of common arguments and tensions between these\\naxes. Positioning the discussion within the axes of interest and with a focus on reconciling the\\nkey tensions, we identify and propose the roles AI Regulation should take to make the endeavor\\nof the AI Act a success in terms of AI fairness concerns.\\n1\\nIntroduction\\nAI applications have grown at an unprecedented rate in recent years and have become ubiquitous in\\nour society. While often deployed with the intention to increase efficiency and fairness of decision-\\nmaking, AI has also sparked many debates on (un)fairness [101]. These debates surround, amongst\\nothers, unfair treatment of individuals and groups due to the reproduction of systemic, institu-\\ntional, and societal biases in AI decisions [14]; the opacity of AI decisions [4]; diverse jeopardies\\nto democracy and societal well-being [74]; risks to consumer privacy [63]; and market inequalities\\nthat are observed in the aggregation of unprecedented levels of power of big companies that develop\\nAI systems (Big Tech) while small and new companies are struggling to enter the market [100].\\nIn many fields of AI application, such as policing, justice, and recruitment, bias and unfairness as\\n1\\narXiv:2207.01510v1 [cs.CY] 8 Jun 2022\\ndescribed above should not only be mitigated to increase fairness but in fact, to avert violating\\nprotected human rights.\\nThe above mentioned undesired effects and consequences of AI application and development\\npropelled the European Union for new regulations, ex-ante reviews, and ex-post monitoring on AI\\nsystems. The European Union intends to assert the AI Regulation through the protection of human\\ndignity and fundamental rights with the Proposal of the Artificial Intelligence Act [85], convinced\\nthat human beings should remain at the center of technological development. However, to make this\\nendeavor of the AI Act a success, to some extent divergent interdisciplinary views and perspectives\\non bias, fairness, and regulation, have to be taken into consideration.\\nWe elaborate on legal, industrial, sociotechnical, and philosophical perspectives in light of iden-\\ntified axes of tension in the debate on AI fairness and regulation: Standardization vs. Localiza-\\ntion, Utilitarianism vs. Egalitarianism, and Consequential vs. Deontological. Further, we discuss\\ndiscrepancies between how these perspectives are addressed in the current Proposal of the Artificial\\nIntelligence Act and make recommendations how they could be addressed for better reconciliation\\nwith all three perspectives and the legal requirements. In sum, we make the following contributions\\nto the ongoing discourse on AI fairness and regulation: i. 
Interdisciplinary perspectives: Com-\\nprehensive interdisciplinary (technical, legal, industrial, sociotechnical, philosophical) discussion of\\nbias, fairness, and regulation (Sections 2 to 6), ii. Mapping tensions of debate: mapping the dif-\\nferent perspectives on fairness in AI applications and regulation on to three axes that reveal tensions\\nin the debate: Standardization vs. Localization, Utilitarianism vs. Egalitarianism, and Consequen-\\ntial vs. Deontological ethics (Section 7), iii. Path forward: Recommendations towards consensus\\nfor a successful AI Act that reconciles divergent perspectives (Section 8).\\n2\\nTechnical Frameworks for Bias and Fairness in AI\\nIn this section we present examples of fairness controversies for selected AI application domains with\\nhigh-stake consequences. Subsequently, we discuss several AI fairness notions and present research\\non guidance to choose between these notions and between measures to mitigate bias in AI systems.\\n2.1\\nExamples of Bias and Unfairness in AI Applications\\nAutomated decision-making systems were suggested to be capable of increased fairness due to avoid-\\nance of human bias interference [52]. However, many cases have come to light in which automatic\\ndecision-making was found to raise critical issues regarding fairness, and reproduces systemic, in-\\nstitutional, and societal biases. Such biases can result in discrimination, unfairness, and issues of\\nprivacy, thus, violating protected human rights (see Section 3). This is especially harmful when au-\\ntomated decision making has high-stake implications for individuals and society. In the following,\\nwe present salient examples.\\nIn Policing and Justice, AI systems are applied across Europe to inform and assist day-\\nto-day police work by profiling people, attempting to predict likely future behavior or locations\\nof future crimes, and assessing the alleged risk of criminal involvement of individuals (e.g., Top\\n600 criminals list and CAS (Netherlands), Delia (Italy), SKALA (Germany). Outcomes of these\\npredictions and assessments are used to justify surveillance, searches, or questioning of alleged high\\nrisk individuals. However they have been suspected to reinforce existing patterns of offending and\\nenforcement [1, 99]. In the judicial arena, automated decision-making is currently being applied\\nin various courts around the world to support certain tasks, such as risk assessment of recidivism,\\nas well as decisions concerning bail amounts, probation periods, and sentencing [94, 116]. Across\\nEurope, such systems are not yet used widely, however, they have been introduced or tested in\\n2\\nsome countries, e.g., in Spain (RisCanvi) or the UK (HART). Završnik [116] highlights potentially\\nviolated rights due to opaque, automated decision-making in the justice system, e.g., the right to a\\nfair trial, the principle of non-discrimination and equality, and the right for explanation.\\nAI systems are further being applied in the domain of Education and Employment, to sup-\\nport candidate selection for higher education admissions and recruitment, e.g., with CV screening,\\ntargeted job advertisement, candidate sourcing, and video screening [2]. 
The risk of bias has been\\ndemonstrated at each of these stages in the recruitment process [11, 56].\\nIn Finance and Banking, AI algorithms constitute the basis of numerous different applications,\\nsuch as market forecasting for trading, or risk management for credit scoring, loan allocations, and\\nmortgage rates [15]. Various cases have come to light in which decisions of such applications were\\nfound to be unfair and biased towards minority borrowers, i.e., with higher mortgage and loan\\nrejection rates for Hispanic and Black borrowers in the US [8, 29], or lower credit limits for women\\nthan for men with equal credit relevant characteristics [36, 106].\\nFor Online Platforms, AI based recommender systems are applied to support users to navigate\\nthe web by filtering information and suggest items (videos, social media content, products, music,..)\\npredicted to be relevant for the user. Recommender systems were found to amplify different kinds\\nof bias, such as representation bias with an over-representation of male, white, and young users [96],\\nand exposure bias where the top 20% of businesses get 80% of the exposure [87], and marketplaces\\npreferentially recommend their own products [21].\\nThis amplifies substantial power imbalances\\nbetween market-dominating platform incumbents (Big Tech) and smaller platforms who do not have\\naccess to equal vast amounts of high-quality consumer data that is vital to enter the market [100].\\nThe resulting immense power concentration in the private hands of very few companies that develop\\nmost AI applications and prioritize profit over benevolence for society poses an additional threat\\nto democracy and society [28, 105]. Further, recommender systems and search result rankings that\\noften optimize to capture attention, determine a large extent of the information to which people are\\nexposed. This can result in distorted exposure to information and viewpoints, as well as exposure\\nto dis- and misinformation, raising issues of fairness and posing a threat to democracies that are\\nreliant on well-informed citizens who can engage in healthy political and social discourse [42, 74].\\nAI systems could threaten democracy and society further by undermining the process of elections\\nthrough targeted advertisements. Such microtargeting provides tools for interference by malicious\\npolitical actors [23, 73].\\n2.2\\nMitigating Bias and Ensuring Fairness\\nMost fairness definitions consider either group or individual fairness. Group fairness is focused on re-\\nquiring that people who belong to protected groups receive on average the same treatment/outcome\\nas the overall population, expressed as the equality of a selected statistical measure across groups [111],\\nsuch as statistical parity, demographic parity, equal opportunity and equality of odds. Individual fair-\\nness focuses on ensuring that any two individuals who are similar except for the protected features\\nreceive equal or similar treatment/outcomes [26]. While ideally, multiple fairness notions would\\nbe met to reach a complete fairness status, this is impossible due to mathematical incompatibili-\\nties between them [68]. 
Criteria to systematize the procedure of selecting between fairness notions\\nwhen making a specific decision have been proposed: Amongst others, the existence of a ground-\\ntruth, base-rates between sub-groups, the cost of misclassification, or the existence of government\\nregulations to meet may be considered [62].\\nFormalization of fairness definitions in a specific context is nuanced and it is important that AI\\npractitioners receive some guidance when designing a fair AI system. Some recent research proposes\\nthe Fairness Compass, a schema in form of a decision tree which simplifies the selection process\\n3\\nby settling for the desired ethical principles in a formalised way [98].\\nA standardized roadmap\\ncould potentially make the identification of an appropriate fairness definition a more straightfor-\\nward procedure, and help document the decision process toward fairness. Audit, monitoring and\\nexplanation might then be more accessible and less expensive. Nevertheless, there should also be\\nspace for stakeholders with deeper understanding of the specific context to contribute refinement\\nand interpretations of any such roadmap.\\nThe fairness notions mentioned above deal with the outcome of automated decision-making.\\nCounterfactual fairness [55] and causal fairness [112], however, have a procedural implication which\\nmight be more suitable for the cases where a counterfactual or causal connection needs to be\\nestablished between features. Most of the existing fairness notions are formalized in a static scenario.\\nIf we want to better understand how bias is encoded in historical data or evaluate the consequences\\nof certain fairness intervention, dynamic fairness notions [20] might offer a better solution.\\nTechnical methods to mitigate bias in algorithms fall under three categories: (1) Pre-processing.\\nPre-processing techniques try to transform/re-balance the data so that the underlying discrimina-\\ntion is mitigated; (2) In-processing. The construction of objective function usually has Utilitarian\\nmotivation behind, e.g. trying to maximize the utility of whole population. In-processing methods\\nfor bias mitigation can be used either by incorporating changes into the objective function or impos-\\ning a fairness constraint; (3) Post-processing. Post-processing methods reassign the labels initially\\npredicted by the black-box model to a fairer state. [66].\\nThe existing technical solutions toward fairness focus on more consequential approaches: the\\noutcome/decision is evaluated by a specific fairness notion and then measures are taken to correct\\nthe unfair outcome/decision. Concerns have been voiced that fairness cannot be simply achieved\\nthrough mathematical formulation approaches as the formalism trap [103] and the seeming success\\nof these technical solutions in the end will hinder pursuits of actual fairness with the cooperation\\nof social practices [39].\\n3\\nA Legal Perspective on Bias and Fairness in AI\\nTo follow one central goal of the EU—the promotion of peace and well-being for its members—\\nEU law aims at ensuring that EU member-states and individuals are treated and treat each other\\nequally and fairly. The blindfolded Justicia further emphasizes the importance of laws that promote\\nfairness, but also fairness within the enforcement of all laws. Decision-making based on machine-\\nlearning could be a promising support for that, to mitigate the unconscious or deliberate biases that\\nwe as humans have. 
However, being trained on (biased) data from previous decisions, the promise\\nof unbiased assessments could not be fulfilled so far [5, 46].\\nIn this section, we will take a structured look at the legal perspective on bias and fairness in AI.\\nWe will start with an overview of EU legislative framework on non-discrimination and the approach\\nto fairness followed by the EU Data Protection Law. Then we will conclude by addressing the\\ntechnical requirements to deal with bias that would be introduced with the AI Regulation Proposal.\\n3.1\\nNon-Discrimination Law\\nThe general principle of non-discrimination in EU law protects people from discrimination and unfair\\ntreatment. European anti-discrimination law is designed to prevent discrimination against particular\\ngroups of people that share one or more characteristics—called protected attributes—and from which\\nthe group acquires the category of a protected group. Concretely, protected attributes under the\\nCharter of Fundamental Rights of the European Union include sex, race or ethnic origin, colour,\\nethnic or social origin, genetic features, religion or other belief, disability, age, sexual orientation,\\n4\\npolitical or any other opinion, language, membership to a national minority, property, social origin,\\nand birth (Art. 21.(1)) [82]. Additionally, the Charter prohibits discrimination on the grounds of\\nnationality, compels the European Union to ensure the equality of everyone under the European law,\\ndemands the respect of cultural, religious, and linguistic diversity, and seeks equality of men and\\nwomen in all areas. Several other European anti-discrimination directives have further covered the\\nlegal protection offered to these protected attributes. Specifically, under the European Legislation\\nmen and women must receive equal treatment in the labour market and regarding the access and\\nsupply of good as services[79, 83]. Likewise, equal treatment must be guaranteed between persons\\nirrespective of their racial or ethnic origin [78], as well as equity shall be respected in employment\\nand occupation in regards to the grounds of disability, religion or belief, age and sexual orientation\\n[77]. Member States expanded the protection towards discrimination through specific national laws\\nand provisions.\\nFurthermore, the European legislation presents two tools to address discrimination, direct and\\nindirect discrimination. Direct discrimination is defined as a situation in which one person is treated\\nless favourable on the grounds of a prohibited criterion than another is, has been or would be treated\\nin a comparable situation [78]. Thus, it is straightforwardly related to the possession of a protected\\nattribute that distinguishes the person from other individuals, regardless of the intention behind\\nthe disparate treatment or the mere existence of less favourable treatment. In the context of data-\\ndriven systems, direct discrimination will cover those cases where the model is not neutral towards\\na protected attribute and offers a less favourable output to individuals on the basis of protected\\ngroups, whether they truly fit into that group or are associated with the protected attribute. 
Since\\nconsciously inputting discrimination into the model will affect its accuracy, these cases are not of\\ngreat concern [113].\\nBy contrast, indirect discrimination will more likely capture many situations of algorithmic dis-\\ncrimination because it affects situations where an apparently neutral provision, criterion or practice\\nwould put members of a protected category at a particular disadvantage compared with other persons\\nunless that provision, criterion or practice is objectively justified by a legitimate aim and the means\\nof achieving that aim are appropriate and necessary [78]. Nevertheless, the prohibition of indirect\\ndiscrimination does not encompass a set of clear and easily applicable rules, it can rather be con-\\nsidered closer to a standard than to a rule [118]. The concept of indirect discrimination results in\\nrather open-ended standards, which are often difficult to apply in practice. It needs to be proven that\\na seemingly neutral rule, practice or decision disproportionately affects a protected group [118]. Due\\nto this, indirect discrimination concerns neutral models, which in principle are blinded to sensitive\\nattributes or do not operate on the basis of those protective attributes. Thus, direct discrimination\\nfocuses on individual cases of discrimination, while indirect discrimination deals with rules and\\npatterns of discrimination and can reveal underlying social inequalities.\\n3.2\\nData Protection Law\\nThe European Union General Data Protection Regulation (GDPR) [84] refers to automated indi-\\nvidual decision-making and seeks, amongst other objectives, to prevent algorithmic discrimination.\\nGenerally, the GDPR states the objective to protect all the fundamental rights recognised under\\nEU law, which the processing of personal data may challenge. According to the GDPR, the core\\nprinciples that shall lead the processing of personal data are lawfulness, fairness, and transparency.\\nConcretely, the principle of fairness entails the processing of personal information that is not in any\\nway unduly detrimental, unexpected, or misleading to the individuals concerned ([48]). Indeed, the\\nprinciple of fairness seeks to protect the individual’s fundamental rights and freedoms, and so, their\\nnon-infringement by such processing. Likewise, the principle of data accuracy requires the control of\\n5\\nthe quality of data for its processing, although it does not address the possible wrongful or dispropor-\\ntionate selection of data and therefore the effect and consequences resulted from such selection [76].\\nTo ensure fair processing, the GDPR requests the use of appropriate mathematical and statistical\\nprocedures for profiling that take into account the risks involved for the interest and rights of data\\nsubjects and prevent discriminatory effects on natural persons (Recital 71 [84]). Furthermore, the\\nGDPR highlights the potential risks to the rights and freedom of natural persons, which could lead to\\nphysical, material or non-material damage, in particular when processing results in discrimination\\n(Recital 75 [84]). Despite these provisions, ensuring fairness is still quite a subjective matter as it\\nrequires that the data processing shall not exceed reasonable expectations nor provoke unjustified\\nadverse effects on the individuals. 
However, what can be considered reasonable expectations and\\njustifiable effects is an open question, leaving the notion of fair processing undefined.\\nHowever, the European anti-discrimination law evidently embedded notions of substantive dis-\\ncrimination and therefore, unjustified algorithmic discrimination, as referred to in Article 5 and\\nRecital 71, implies unfair processing [38]. From the legal perspective, discrimination collides with\\nequality, infringing the principle of fairness; whereas from a technical perspective, algorithmic dis-\\ncrimination straightforwardly entails unfair processing (see Section 2).\\n3.3\\nEU Artificial Intelligence Regulation Proposal\\nWith the EU Artificial Intelligence Act the European Union aims at laying down harmonized rules\\non artificial intelligence with four specific objectives [85]:\\n1) ensure that AI systems placed on the\\nUnion market are safe and respect existing law on fundamental rights and Union values; 2) ensure\\nlegal certainty to facilitate investment and innovation in AI; 3) enhance governance and effective\\nenforcement of existing law and safety requirements applicable to AI systems; 4) facilitate the de-\\nvelopment of a single market for lawful, safe and trustworthy AI applications preventing market\\nfragmentation.\\nIn essence, the Proposal seeks to balance legal certainty and the development of AI systems\\nwhile ensuring an approach that respects European values, principles and laws. The specific pur-\\npose of the Proposal is to establish a classification for trustworthy AI systems based on a risk-based\\napproach, to introduce new legal obligations and requirements on public authorities and businesses\\nfor the development and application of AI systems, to prohibit harmful AI-enabled practices, and\\nto set new monitoring and enforcement regimes. Essentially, the Proposal will set a legal framework\\napplicable for developers and end-users of AI systems which specific characteristics—opacity, com-\\nplexity, dependency on data, autonomous behaviours—can adversely affect a number of fundamental\\nrights enshrined in the EU Charter of Fundamental Rights [85].\\nThe Proposal delimits a set of prohibited AI practices considered harmful because they contra-\\nvene EU values and violate fundamental rights. Second, the Proposal outlines specific obligations\\nto avoid the appearance of bias in two types of high-risk AI systems; (1) those which are intended\\nto be used as a safety component of a product or is itself a product, and this product is subject to\\nan existing third-party conformity assessment, and (2) those which are involved in decision-making\\nprocesses in the following areas; (i) biometric identification and categorization of natural persons,\\n(ii) management and operation of critical infrastructure, (iii) education and vocational training, (iv)\\nemployment and workers management as well as access to self-employment, (v) law enforcement,\\n(vi) migration, asylum, and border control management, and (vii) administration of justice and\\ndemocratic processes (see Section 2.1).\\nAccording to the Proposal, AI systems can only be placed into the EU market if they comply with\\nthe certain minimum requirements specified in the legislation, requirements that become stricter as\\nthe risk associated with the system increases (i.e., minimal risk, low risk, high risk, and unacceptable\\n6\\nrisk). 
Consequently, providers will need to carry out ex-ante conformity assessments and implement\\nquality and risk management systems and post-market monitoring to ensure compliance with the\\nnew regulation and minimise the risk for users and affected persons. However, the Proposal pays\\nlittle attention to identifying the causes and proposing recommendations to tackle the potential\\ndiscriminatory harms of AI systems. Specifically, the Proposal mainly focuses on biases in data\\nsets, forgetting other types such as those that may arise from the choice of algorithms, and the\\noptimization or evaluation of metrics. Additionally, the Proposal may pose unreasonable trust in\\nhuman operators—i.e., human in the loop—to identify and recognise cases of bias and discrimination\\nin AI systems.\\nThe Proposal does not provide detailed guidance on dealing with unavoidable trade-offs for\\nthe different stakeholders when debiasing and monitoring bias in the data set. Nevertheless, some\\ninsights can be found in the Proposal regarding the expected requirements to debias high-risk\\nAI systems.\\nFirstly, there will be an obligation to establish appropriate data governance and\\nmanagement practices concerning the training, validation, and testing of data sets, in particular, to\\nexamine possible biases, ensure the relevance, representativeness, absence of errors and completeness\\nof the data sets, and their consideration with the characteristics or elements that are particular to\\nthe specific geographical, behavioural or functional setting within which the high-risk AI system is\\nintended to be used [85]. Secondly, a novel exception to the Data Protection Regulation will allow\\nto the extent that it is strictly necessary for the purposes of ensuring bias monitoring, detection and\\ncorrection in relation to the high-risk AI systems [85] the processing of special categories of data.\\nFinally, the Proposal asks for developing methods that will ensure the detection of biased outputs\\nand the consequent introduction of appropriate mitigation measures as it recognises the potential\\nof AI systems to develop biased outputs due to outputs used as an input for future operations, i.e.,\\nfeedback loops.\\nInterestingly, the Proposal also details the role of standards and specifications in the AI landscape\\n[85]. On the one hand, the Proposal addresses the use of harmonised standards to presume the\\nconformity of AI systems with the regulation’s requirements. On the other hand, the Proposal\\nentitles the Commission with the duty to adopt common specifications and technical solutions\\nwhen the harmonised standards are insufficient or there is a need to address specific or fundamental\\nrights concerns. In other words, conformance with technical standards and common specifications\\nshould give providers of high-risk AI a level of confidence that they are compliant with the mandatory\\nrequirements of the proposed EU AI Regulation as well as significantly cutting the cost of compliance\\nfor business [65].\\nWhereas neither the standards nor the specifications will be compulsory for\\nproviders of high-risk AI systems, their non-adoption shall entail a justification as to which and\\nwhy other technical solutions were adopted.\\n4\\nAn Industry Perspective on Bias and Fairness in AI\\nSubstantial research on ML fairness, even for industry applications, has originated out of academic\\ncontexts. 
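To make the data-governance obligations sketched above slightly more concrete, the following is a purely illustrative Python sketch (not drawn from the Proposal or any cited work) of the kind of representativeness and completeness check a provider might run on a training set; the column names, reference shares, and toy data are hypothetical assumptions.

```python
# Hypothetical sketch of a dataset representativeness / completeness check.
# Column names ("gender", "label"), reference shares, and the toy data are assumptions
# made for illustration; they are not prescribed by the Proposal or the surveyed literature.
import pandas as pd

def representativeness_report(df: pd.DataFrame, group_col: str, reference_shares: dict) -> pd.DataFrame:
    """Compare observed group shares in the data against reference population shares."""
    observed = df[group_col].value_counts(normalize=True)
    rows = []
    for group, ref_share in reference_shares.items():
        obs_share = float(observed.get(group, 0.0))
        rows.append({"group": group,
                     "observed_share": round(obs_share, 3),
                     "reference_share": ref_share,
                     "gap": round(obs_share - ref_share, 3)})
    return pd.DataFrame(rows)

def completeness_report(df: pd.DataFrame) -> pd.Series:
    """Fraction of missing values per column, a crude proxy for 'completeness'."""
    return df.isna().mean().round(3)

# Toy usage
data = pd.DataFrame({"gender": ["f", "m", "m", "m", "f", "m", None, "m"],
                     "label":  [1, 0, 1, 0, 1, 1, 0, 0]})
print(representativeness_report(data.dropna(subset=["gender"]), "gender", {"f": 0.5, "m": 0.5}))
print(completeness_report(data))
```

Such checks address only the data-set side of bias that the Proposal emphasises; as noted above, biases arising from the choice of algorithms or of optimization and evaluation metrics would not be caught this way.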
Academic research has first proposed most fairness principles and quantitative methods\\nto mitigate biases and unbalanced data with general application domains [6, 59, 66]. Toolkits ap-\\npeared ready to be integrated for the industry, even if often developed following non-contextual\\ndesign rationales based upon the issues of algorithmic methods [43]. Until recently, the technical\\nnature of academic contributions have often not addressed the practical issues that industry prac-\\ntitioners face when adopting and engaging with fairness tools. Practitioners have pointed out the\\nlack of ethical tools’ usability in real-world applications due to a series of critical factors preventing\\nthe straightforward adoption of fairness principles and methods [69]. Following Morley et al. [71],\\n7\\nsuch non-effectiveness in real-world cases stems from how fairness compliance is operationalized\\ninside companies. If not developed with the sociotechnical features and constraints of AI product\\ndeployment in mind, these methods could easily lead to failures [43] including for example fair-\\nness definitions misinterpretation [54], obfuscation of practitioners’ accountability [81], and gaming\\nfairness measures as a method of ethics-washing [71]. To avoid shortcomings, researchers are now\\nfocusing on how to operationalize fairness frameworks based on the needs of industry practitioners.\\nVeale et al. [110] conducted interviews with decision makers in high-stakes public-sector contexts.\\nPractitioners were found to be lacking incentives and practices for algorithmic accountability due\\nto resource constraints and dependency on prior infrastructure. Holstein et al. [44] enlarged the\\npool of industry practitioners with a systematic investigation of ML product development. Amid\\nthe area of intervention were identified issues of data quality provenance and reporting, as well as\\nthe need for domain-specific educational resources and compliance protocols, intended specifically as\\ninternal auditing processes and tools for fairness-focused debugging. Rakova et al. [92] reported that\\npractitioners often felt a hostile organizational environment where they were hindered or uncom-\\npensated when trying to implement fairness practices independently. Disincentive stems from the\\nlack of educational programs, rewards, accountability allocation, and communicative protocols over\\nfairness issues, especially when different parts of an AI development are distributed across different\\nteams. This resulted in practitioners often feeling disoriented, unprepared, or even overwhelmed by\\nfairness tools and checklists [19, 44]. It was also observed that practitioners recommend establishing\\ninternal and external investigation committees to create an inclusive and preventive environment\\nand to provide resources such as protocols or educational teams [61, 92]. Other research examples,\\nonce informed on practitioners’ needs, focused on designing different AI fairness solutions: check-\\nlists to be aligned with teams’ workflows and organizational ad-hoc processes, fairness frameworks\\nor internal algorithmic auditing protocols designed for industrial applications [61, 91]. Recently,\\nRichardson and Gilbert [97] proposed a complete industry framework of stakeholders and fairness\\nrecommendations while specifying operationalization pitfalls. 
Ibáñez and Olmeda [47] distinguished\\ntwo main perspectives on operationalizing fairness practices in organizations: a bottom-up, reactive\\napproach, where prior organizational processes restrain best practices, or top-down, where a proac-\\ntive approach is set in place according to the translation of principles and methods as actionable,\\niterative steps designed with stakeholders’ needs and concerns in mind. Interestingly, the literature\\nagrees that fairness interventions should not be standardized and reactive to prior single instances of\\norganizational infrastructure issues, but proactive, based on a thorough understanding of different\\nstakeholders’ needs, and accounting for domain-specific and contextual factors.\\nIn regards to the Proposal, it is not yet clear how fairness practices will be effectively opera-\\ntionalized given the mechanisms envisioned in Articles 43 and 61 from the Proposal, respectively\\nfor conformance checking and post-market monitoring of high-risk systems.\\nFor those systems,\\nproviders will be demanded to draft and verify their conformance through a quality management\\nsystem, technical documentation, and post-market monitoring under the lens of a national body.\\nThis body will be guided by a national supervisory authority in coordination with the EDPB (Eu-\\nropean AI Board from the EU commission). Yet, some detractors, in line with some concerns over\\norganizations’ ethics washing, advanced skeptical doubts on the procedural efficacy of these audit-\\ning mechanisms [60, 64]. Doubts were related to the undisclosed nature of conformity declarations\\nas well as the nature of contributions of data criteria input to the EU database for stand-alone\\nhigh-risk AI systems in Article 60, withheld from the scrutiny of those affected by such systems\\nand available only upon regulatory bodies’ request. This loose gravity towards the public interest\\nmight not permit to enforce EU citizen fundamental rights to decide whether a system should be\\nlisted as high-risk. In light of the concerns for more structural fairness practices, the evolution of an\\noverly rigid and costly compliance environment could critically undermine these needs. An official\\n8\\nimpact assessment has been proposed [95] to quantify these costs. Mueller [72] advanced an analysis\\nof the economic costs that could arise for EU small and medium enterprises and corporations. In\\nthe forecast, effects will push away venture capital investors, drain European talents and tighten\\nstronger external dependencies leading to a highly unfavorable European environment, with the risk\\nof being excluded from the global AI market. Academics and policy analysts have advanced a de-\\nbate on the validity of those claims, picturing less-burdening assessments over quality management\\nsystems, thus calling the report factitious [37, 57]. Future predictions will need to account both\\nfor amendments to the terminology and procedures. Foremost, central analysis focus should be\\ngiven to the ecosystem of digital technology regulations that the EU has on its agenda [80]. These\\ndigital Proposals constitute the European intention of enforcing its legislative sovereignty and set\\nstandards for the international market. 
Leveraging the Brussels Effect [12, 31] and the current rise\\nof AI ethics attention across a wide range of institutional and academic stakeholders [35, 102], it is\\nreasonable to predict that in the near future current investments in integrating fairness governance\\npractices could be streamlined into more mature and efficient regulatory frameworks with lower\\nprocedural costs while mitigating reputational risks [92].\\n5\\nA Sociotechnical Perspective on Bias and Fairness in AI\\nRegarding AI fairness and discrimination, many have pointed out that AI is not merely a tool, it is\\na sociotechnical endeavour, meaning that the development, use of (and harm from) AI technologies\\ncan not be separated from their specific social contexts [27, 90].\\nWhen attempting to prevent\\nharm from technologies we must look closely at a new technology’s actual capacities and functions\\nwithin these contexts. An over-emphasis of the role of specific technological features of AI in either\\ncausing, or preventing, discrimination, for example, can obscure other forms of discrimination that\\nare occurring, as well as lead to an unproductive and ultimately distracting focus on fixing or\\nregulating those specific features [33, 90].\\nVeale and Borgesius [109] make a similar argument in regards to the Proposal.\\nThey cite\\nthe examples of the prohibition against releasing AI systems that use subliminal or subconscious\\ntechniques to distort a person’s behaviour and argue that this focus on evocative, ripped from the\\nheadlines potential harms does little to mitigate actual harms and adds little to existing legislation\\n[109]. Issues include, for instance, that prohibition only covers manipulative systems that cause\\nindividual harm but not a collective harm or a harm that arises from dynamics of the user-base\\nentwined with an AI system [109] and that there must be intent to distort behaviour. Dourish\\nand Bell [25] identified a similar phenomenon surrounding the discussion and implementation of\\nubiquitous computing technologies and contrast the myth used to build visions of technologies and\\nthe messiness of the practical implementation of technologies in reality.\\nThey further describe\\nubiquitous computing researchers as explaining away limitations and unexpected consequences of\\nspecific systems by referring to a proximate future where the given technology will be fully realized\\nand highly useful, as soon as a few remaining kinks (such as unevenly distributed infrastructure,\\nfor example) are ironed out [25].\\nIn the case of the messy realities of AI, it is widely acknowledged that it is non-trivial to build\\nerror-free models and good quality data within the context of societal factors and power structures\\nat play [18, 27, 67]. To give a specific example, data workers who are frequently manually labeling,\\ncleaning, and enriching the data used for training AI models, have a crucial role in the development\\nof AI systems and their practices are subject to a myriad of non-objective influences [67]. Similarly,\\nthe harms often identified with AI use online, such as hyper-personalization, invasion of privacy, and\\nspread of hate speech can stem from issues beyond the technology, such as monopolies, data power\\n9\\nimbalances, and un-checked corporate crime [24]. 
Some have argued that those aspects of online\\nlife are a requisite feature of an emerging economic system that has grown out from the existing\\ncapitalist economic system [117].\\nTherefore, we must acknowledge the systemic sources of the discrimination when mitigating\\ndiscriminatory harm of AI technologies and the discussion of the impact of such technologies should\\nstart at an earlier point. In particular, we must look at the specific setting of a given case. This\\nincludes considering what specific sociopolitical goals a given AI system is enforcing. For example,\\nin Austria, a risk assessment algorithm created for use in the public employment system has been\\ndescribed as guided by a philosophy of neo-liberal austerity in the social sector which has been\\nreplacing the concept of the European welfare state [3]. We must also consider where the discussions\\nare happening, who is involved in the discussions, and how the population is able to discuss and\\nenforce whether an AI in a domain should be used at all. In regards to the Proposal, according\\nto [109], there is evidence of industry influence in high level policy decision-making surrounding the\\ncurrent Proposal.\\nAnother complication in regulating and mitigating harm from AI is the complexity of determin-\\ning how, or if, it is possible to distinguish between AI decisions and human decisions. If we do not\\nacknowledge these entanglements, there is a risk of bias being addressed with overly mechanistic ap-\\nproaches. In reference to the example of privacy ethics, Nissenbaum [75] has described how a focus\\non the very attempt to mitigate privacy concerns by ever more sophisticated anonymization meth-\\nods can lead to overlooking other issues, such as algorithms that do not infringe on privacy, yet are\\nstill harmful. Similarly, a focus on attempting to operationalize a very specific concept of fairness,\\nand to regulate specific methods for monitoring it, risks pulling awareness from other algorithmic\\nharms, or even obfuscating underlying causes of harm [7, 90]. In the case of the Austrian AMS,\\ndescribed above, the controversy of a proposed algorithm opened up a whole discussion about how\\na Public Employment System should be run overall. From the perspective of power aware analysis\\n[67] everyone affected needs to be involved in those decisions.\\n6\\nA Philosophical Perspective on Bias and Fairness in AI\\nWe also look at developments in AI and algorithmic fairness through the lens of moral philosophy,\\nspecifically normative ethics [49], which essentially investigates the question of whether something is\\nmorally right or wrong. There are two major schools of thought in normative ethics; (i) Deontological\\nethics argues the existence and significance of inherent rightness of an action (examples include\\nKant’s categorical imperative [86], and Rawls’ veil of ignorance [93]); (ii) Consequentialism judges the\\nmorality of an action based on the value it brings (examples include welfarism [51], hedonism [70]).\\nWhile our deontological views inform the building blocks of morality in today’s society (e.g., EU\\nfundamental rights), consequential approaches enjoy scalability through the use of representative or\\nproxy metrics in real-world usages (e.g., cost-benefit analysis [58] or per-capita income in economics,\\nand overall accuracy in machine learning as discussed in Section 2). 
Traditional AI research often\\nfollows a declarative approach where a mathematical objective is designed and optimized while\\ncaring less about the decision-making process and its correctness or representativeness [13, 16,\\n29].\\nSuch an approach can be argued to be a consequentialist’s approach to AI whereby only\\nthe optimization of final objective matters and the end justifies the procedure.\\nHowever, this\\napproach has received a lot of critique within the AI domain, and a range of issues have been pointed\\nout; for example concerning causality [17, 34], fairness [29, 66], explainability [13], including the\\ncomparability and robustness of explanations [88, 89], and trustworthiness [107].\\nAnother angle from which AI developments can be looked at, is Welfarism [51] (a type of\\n10\\nconsequentialism), which suggests choosing the action that maximizes the welfare or well-being of\\nthe population. In fact, it is widely used in some areas of economics, game theory, social-choice\\ntheory, and applications. Welfarism is often studied in two major forms; (i) Utilitarianism [104]\\nemphasizes maximizing the welfare of the population; (ii) Egalitarianism argues for equality often\\nleading to a form of Rawlsian justice [93] which comes under deontological ethics, but its objective\\nform in welfarism tries to maximize the welfare of the worst-off. Utilitarianism is found to be heavily\\nembedded in today’s society. For example, the optimization objectives (loss functions) in machine\\nlearning are often the aggregate errors over the set of data points or the individuals, i.e., utilitarian\\nin nature. Utilitarian social welfare is quite prevalent in economics, computational social choice\\n(allocation, voting, etc.)1. Such utilitarian objectives tend to optimize for the overall utility while\\nmay be best-serving the majority and poorly serving the minority populations. This is one of the\\nreasons due to which the usual loss-minimizing objectives have been found to be unfair in many\\napplications including criminal justice, banking, and gig-economy. On the other hand, egalitarian\\nwelfarism in machine learning would likely try to equalize the errors of all or groups of individuals\\ninstead of minimizing the aggregate errors.\\nIn fact algorithmic fairness notions like individual\\nfairness [26], equal opportunity and equality of odds [40], equal mistreatment [114] are either inspired\\nby or promote egalitarian views in consequential modeling (error represents a consequence). These\\nnotions have been found to reduce the effects of pre-existing biases in data and to improve the\\nutilities of marginalized groups under algorithmic decision-making systems.\\nA few recent works have also explored non-consequential or deontological approaches to algo-\\nrithmic fairness. These works can be grouped into two categories. (1) Works on procedural fairness\\n[30, 32] argue that it is essential for the chosen design and principles to be socially acceptable.\\nThus, these works focus on understanding how people assess fairness and ways to infer societal\\nexpectations about fairness principles thereby accounting for all voices in designing fair decision-\\nmaking systems. For example, Grgić-Hlača et al. [32] propose a framework for procedural fairness\\nby evaluating the moral judgments of humans regarding the use of certain features and accordingly\\ndesigning decision-making systems. 
(2) Another set of works argue for causal and counterfactual\\nfairness, i.e., addressing unfair causal effects of sensitive attributes in the decision-making process\\n[17, 55]. Instead of focusing on the outcome alone, these works have explored deontological aspects\\nand propose to ensure fairness in the decision-making process.\\nFootnote 1: Nash social welfare [50] is an exception.\\n7\\nMapping Perspectives\\nWe have identified three axes along which different perspectives in the debate about AI Regulation\\nfor preventing unfairness and discrimination fall. These axes may or may not be orthogonal, i.e.,\\nthey may have relationships with each other. In the following sections, we define the axes and\\ndescribe the debate surrounding regulating fairness in AI represented by each axis. These are not\\nall of the axes of debate, rather these are salient tensions that we have identified. We find them\\nhelpful in conceptualizing and mapping the values and desiderata of the perspectives we are focusing\\non.\\n[Figure 1: Three identified axes along which the debate about regulation of fairness in AI falls; here they\\nare aligned along high-level characterizations and common arguments made for, and against, each of the\\nends of the identified axes. Axis ends: Standardized vs. Localized, Utilitarian vs. Egalitarian, and\\nConsequential vs. Deontological; the standardized/utilitarian/consequential side is characterized as\\nobjective, easy to deploy, and majoritarian, the localized/egalitarian/deontological side as subjective,\\nhard to deploy, and inclusive.]\\n7.1\\nAxis-1: Standardization vs. Localization\\n7.1.1\\nThe axis\\nThis first axis of interest addresses the differences between standardization and localization. Standardization\\nentails the process of making things of the same type all have the same basic features\\n(from Cambridge dictionary), specifically, through the creation of protocols to guide the design,\\ndevelopment, and creation of such goods or services based on the consensus of all the relevant parties\\nin the industry. Standardization is intended to ensure that all the goods and services produced\\nrespecting those protocols come with the same or equivalent quality, safety, interoperability and\\ncompatibility. For this reason, multiple parties need to be involved in developing such protocols\\nand standards, namely, manufacturers, sellers, buyers, customers, trade associations, users or regulators\\n(https://www.iso.org/standards.html). By contrast, localization describes the process of\\nmaking a product or a service more suitable for a particular country, area, etc. (from Cambridge\\ndictionary). In essence, localization entails adapting the product or service to the characteristics of\\na given culture, region, or society.\\n7.1.2\\nPros and cons\\nIn the context of AI, advocates for and members of industry frequently cite standardization as\\na method for preventing or mitigating discrimination [41, 53, 108]. In this respect, high-risk AI\\nsystems will be presumed to comply with the requirements established in the AI Proposal if they\\nare, as well, in conformity with the harmonised standards published by the Official Journal of the\\nEuropean Union as referred to in article 40 [85]. Likewise, high-risk AI systems in conformity with\\nthe specifications referred to in Article 41 of the AI Proposal will be presumed in conformity with\\nthe regulation [85]. 
In this sense, conformity with standards and specifications as proposed in the AI\\nRegulation will allow the entry of high-risk AI systems in the European market while guaranteeing\\nagreed levels of quality and safety that ensure adherence to European principles and values (i.e.,\\nnon-discrimination, fairness, and human dignity).\\nA dilemma regarding standardization, however, appears when there is a disagreement regarding\\nthe standard of fairness that should be used to assess AI systems. As presented in Section 2.1, the\\nstraightforward example of incompatible fairness standards concerned the case of COMPAS and\\nthe different standards followed by ProPublica [5] and Northpoint [22] for their fairness assessments,\\ni.e., disparate mistreatment and calibration respectively [118]. Moreover, overly specific and strict\\nstandards and frameworks risk encoding a biased, restrictive, singular worldview that is not relevant\\nto everyone, and may ultimately lead to uniformization from a top-down approach (see Section 4). In\\ntruth, standardization as a method to enforce fairness can in some cases overlook the root causes\\nof bias, setting standards and notions of fairness that do not offer a real solution to the intrinsic\\ndiscrimination or biases in certain situations or contexts (see Section 5). A purely hypothetical example\\nof this problem would be a hard-coded requirement for gender parity in school admissions or\\nhiring where there was a low representation of one of the genders, e.g., due to relocation for work\\nreasons or armed conflicts. The solution would be to establish an acceptable ratio of males to females\\nset at a level appropriate to the local context, rather than a strict gender parity requirement.\\nIn this regard, localizing AI systems entails the process of making them local in character by\\nlimiting the ethics regulation and specifics of enforcement to the desired area. Whereas the complete\\nlocalization of AI systems would conflict with the embedded values of the AI Regulation (e.g.,\\nthe European Common Market and European Fundamental Rights), the localization of some of the\\ndecisions regarding their design, development, or deployment may allow a more tailored approach\\nto address AI discrimination and biases in specific geographical, cultural, or sociotechnical contexts.\\nThe localization of some requirements and technical solutions may also allow for the definition\\nof ethical and legal guidelines that address the specific circumstances of a community, local area, or\\nsector beyond the general standards and specifications.\\n7.2\\nAxis-2: Utilitarian vs. Egalitarian\\n7.2.1\\nThe axis\\nThe second axis of interest addresses differences between utilitarian and egalitarian views. While\\na utilitarian philosophy is one of maximizing the overall welfare of the population, egalitarianism\\naims for equality amongst all those people.\\n7.2.2\\nPros and cons\\nUtilitarianism has long been argued to be in conflict with certain conceptualizations of fairness\\n(see Chapter 14 of Hooker [45]). In the context of AI, algorithms are often designed to optimize\\nfor certain mathematical objectives (which can be categorized as a declarative approach). The\\nobjective functions in machine learning tasks usually measure a form of aggregate accuracy over a\\npopulation, which fits the definition of a utilitarian measure; a small sketch contrasting the two views\\nfollows below. 
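As a purely illustrative sketch of this contrast (not taken from any cited work), the snippet below compares a utilitarian objective, the population-average loss, with one common egalitarian reading, the worst-off group's average loss; the group labels and loss values are made-up assumptions.

```python
# Hypothetical illustration: "utilitarian" aggregate loss vs. an "egalitarian"
# (worst-off group) loss computed from the same per-example errors.
# Group labels and loss values are invented for the example.
import numpy as np

losses = np.array([0.10, 0.20, 0.10, 0.90, 0.80, 0.15])  # per-example errors
groups = np.array(["a", "a", "a", "b", "b", "a"])         # group membership

utilitarian_loss = losses.mean()                           # average over the whole population
group_losses = {g: float(losses[groups == g].mean()) for g in np.unique(groups)}
egalitarian_loss = max(group_losses.values())              # error of the worst-off group

print(f"average (utilitarian) loss: {utilitarian_loss:.3f}")
print(f"per-group losses:           {group_losses}")
print(f"worst-group (egalitarian) loss: {egalitarian_loss:.3f}")
```

In this toy setting, minimizing only the average loss can leave group "b" with a much higher error than group "a", which is exactly the concern raised in the following paragraph.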
Optimizing solely for such a measure in\\nAI applications risks optimizing the utility of the whole population while hurting minority groups in\\nmany cases [40, 114]. Utilitarian approaches are so ingrained in the computing research and development\\nmindset that the early group fairness notions (which are supposed to mitigate the discriminatory\\neffects of utilitarian objectives), such as demographic parity, had been reduced to utilitarian forms\\nby constraining over the aggregate benefits or outcomes of groups of individuals [115]. The literature\\nhas now moved on to notions such as individual fairness, equal opportunity, and treatment parity\\nwhich, even though outcome-based, are more egalitarian in nature.\\nDespite utilitarianism’s obvious conflicts with fairness, and egalitarianism’s close connection to it,\\nutilitarian welfare is often cited as a necessary factor in system and policy design. In fact, protecting\\nthe EU’s economic interests is stated as a goal of the AI Act [85]. Since utilitarianism captures\\na certain overall efficiency of a system (accuracy in machine learning, utilitarian welfare in\\neconomics), its goals often reflect business-oriented metrics of AI applications (i.e., click-through rate\\nfor recommendations in online marketplaces, or success rate of ranked workers on gig-economy\\nplatforms). However, there might be a trade-off between maximizing efficiency and achieving other\\nsocial objectives like equity or fairness in cases of inherent imbalance in the data or population\\n[9, 10].\\n7.3\\nAxis-3: Consequential vs. Deontological\\n7.3.1\\nThe axis\\nThis third axis of interest from the discussions in Sections 3 to 6 represents the differences between\\nconsequential and deontological ethics. Deontological ethics argue for the existence of the inherent\\nrightness of an action, while consequential ethics evaluate morality based on the consequences of an\\naction.\\n7.3.2\\nPros and cons\\nTechnical measures for mitigating AI-based discrimination tend to focus on fairness notions, whereby\\na fairness constraint is often added to the original objective. Fairness in this case is defined by\\nstatistical properties of the outcome/decision of the system (e.g., demographic parity). Fairness\\nnotions thus seek to reduce harm by adjusting or influencing the outcome to fit some statistical\\ndefinition of fairness. While the motivation for doing this may be based on deontological principles\\nof equality, this approach belies a consequentialist definition of fairness, wherein one declares that\\nfairness has been achieved through an equality in outcome, such as an equal amount of good (accurate)\\nand bad (inaccurate) outcomes for each group (a minimal sketch of such outcome-based checks\\nfollows below).\\nDeontological ethics is often given as an opposite to consequentialism. A deontological approach\\nargues for the existence and significance of the inherent rightness of an action; in the context of\\nAI-based discrimination, this would suggest that the approach described above does not meet the\\ncriteria of acting morally, as the focus is on shifting the outcome. From a deontological perspective,\\nan AI system is unlikely to be fair if the development of AI itself is not driven by essential guiding\\nprinciples, such as fairness.\\nThe Proposal’s prohibition of certain uses is based on deontological principles of protecting\\nfundamental individual rights. However, the risk-based approach could be viewed as consequential,\\nin that it only targets systems used in contexts perceived as being highly consequential. 
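The following is a minimal, hypothetical sketch (not from any cited work) of the outcome-based statistical checks referred to above: a demographic parity gap and per-group error rates. The decisions, labels, and group names are invented for illustration.

```python
# Hypothetical illustration of outcome-based fairness checks: demographic parity
# difference and per-group error rates. All data and group names are invented.
import numpy as np

def positive_rates(y_pred, groups):
    """Share of positive decisions received by each group."""
    return {g: float(y_pred[groups == g].mean()) for g in np.unique(groups)}

def per_group_error(y_true, y_pred, groups):
    """Share of wrong decisions (bad outcomes) for each group."""
    return {g: float((y_true[groups == g] != y_pred[groups == g]).mean())
            for g in np.unique(groups)}

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # "correct" outcomes
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 0])   # decisions made by the system
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

rates = positive_rates(y_pred, groups)
print("positive-decision rate per group:", rates)
print("demographic parity difference:   ", round(abs(rates["a"] - rates["b"]), 3))
print("error rate per group:            ", per_group_error(y_true, y_pred, groups))
```

Equal values on such metrics are exactly the kind of "equality in outcome" characterized above as a consequentialist reading of fairness; they say nothing about whether the decision process itself was driven by fair principles.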
Because the risk-based approach only covers such contexts, many AI systems which might exhibit\\nharmful representational or discriminatory biases, such as social media and online platforms, are\\nrelieved of any requirements.\\nSummary: Based on the pattern of high-level characterizations and common arguments made\\nfor, and against, each end of the identified axes, we place them along a single axis, with one end\\ncontaining localized, deontological, egalitarian approaches (LED) and the other end containing\\nstandardized, utilitarian, consequential approaches (SUC); we illustrate this mapping in Figure 1.\\nThe LED end contains approaches that purport to acknowledge systemic and complex causes of\\ndiscrimination and are often criticized as being overly subjective and hard to deploy. The approaches\\non the SUC end purport to be objective and easy to implement while often being critiqued as\\nfailing to recognize systemic causes or ensure inclusion of minority voices. This mapping of the\\nperceived benefits and shortcomings of each approach allows us to identify a key tension in the\\ndebate on regulating fairness in AI. It is one that is based on differing understandings of the nature\\nof bias and discrimination, along with differing priorities as to what constitutes practicality and\\nimplementability in efforts to increase fairness. Following this, we suggest how the Proposal could\\nbetter balance these values, as well as the differing perspectives of stakeholders, to achieve the stated\\ngoal of guaranteeing agreed levels of quality and safety in accordance with European principles and\\nvalues (i.e., non-discrimination, fairness, and human dignity) without creating major hurdles for the\\nEuropean AI Industry.\\n8\\nKey Agreement and A Path Forward\\n8.1\\nKey Agreement\\nWe see a specific agreement amongst the presented perspectives regarding limitations of the current\\nregulation. Ultimately, each of the perspectives agrees that regulation needs to be grounded in the\\nreality of the contexts in which AI is used, and that this is not sufficiently achieved in the Proposal. A brief\\nsummary of these previously discussed realities, which the Proposal does not sufficiently account for,\\nis as follows:\\n1. lack of agreement on what technology like AI really is and what its capabilities are,\\n2. cost and complexity for a business to follow the required regulations,\\n3. the known limitations of debiasing techniques and explanations of black boxes,\\n4. lack of specifications on how to best implement human oversight in the context of AI systems,\\n5. varied and shifting notions of fairness within society,\\n6. impact of power imbalances (e.g., technological divide, data power, company size, and market share) on the creation and enforcement of, and the ability to comply with, the Proposal.\\n8.2\\nA Path Forward: Balancing Perspectives\\n8.2.1\\nStandardization and Localization\\nStandardization may facilitate the translation of fundamental rights, i.e., the right to fairness, into\\nstandards and specifications to be followed and complied with by all AI actors, with the aim of\\nensuring that AI systems do not discriminate against nor mistreat individuals.\\nLikewise, localization may allow the clarification of deontological values in more specific and\\nconcrete requirements, metrics, or assessments, particular to each enforcement context. This is to\\nprevent a top-down enforcement of operationalizations of fairness that are untenable, or even unfair,\\nin some contexts. 
For example, in Section 4 we have summarized the literature demonstrating that\\nensuring fairness compliance from the AI industry could also be well served by a more localized approach\\nto operationalizing fairness. This does not imply relativizing the legal and ethical principle\\nof fairness; on the contrary, it takes into account the wider scenario beyond the purely technical\\nnature of AI and strengthens the enforcement of fairness throughout the whole life cycle of AI.\\nProposed role of AI Regulation\\nStandardization should be used to the extent that the measure\\nhas a direct link to upholding the deontological value of fairness. In order to ensure the principle\\nof universalization, though, special care must be taken to build in flexible localization allowances.\\n8.2.2\\nUtilitarian and Egalitarian\\nIt may be possible to maintain an egalitarian approach to AI Regulation, while also taking\\nadvantage of the potential benefits of utilitarian measures. For example, to promote equality (i.e.,\\nbring in egalitarianism) all stakeholders could be given sufficient power to provide inputs on how\\nto maximize and measure their welfare. Any decisions about utilitarian measures would then be\\nbased on this input. Note that increased awareness of the use of AI systems and their implications\\nfor fairness among the responding individuals (stakeholders) is essential for a successful\\nprocess. This approach would, again, bring up the question of standardization versus localization,\\nspecifically, how highly localized measures would need to be in order to adequately account for the policy\\nexpectations of all individuals in an egalitarian fashion. To address this, we would defer to the\\nprinciples suggested in Section 8.2.1. Extensive work is needed to determine how best to implement\\nsuch a process, but some of the open questions may be best left answered by the inclusive input\\nprocess itself.\\nProposed role of AI Regulation\\nThe specific framework for how to obtain and incorporate\\nstakeholder inputs should be laid out. A way needs to be found to ensure that all stakeholders\\nhave sufficient power and influence in AI Regulation decision-making processes and that they are\\nthemselves sufficiently aware of the potential adverse implications of AI technology.\\n8.2.3\\nDeontological and Consequential\\nThe EU’s stance on fairness is deontological, in that fairness is justified by itself, with no direct\\nsubordination to its eventual outcomes. What matters is whether the action is motivated by duty\\n(respect of the moral law: dignity and universalization). However, expectations of individuals on the\\nspecifics of what constitutes freedom, equality, and dignity may vary across cultures, geographies,\\nand contexts. This has led digital and human rights groups to highlight that AI policies should\\nempower individuals, communities, and organisations to contest AI-based systems and to demand\\nredress when they themselves determine that their fundamental rights have been violated [7].\\nThe Proposal itself is not intended to legislate individual rights; that is intended to be covered\\nin other laws of the European legal framework. With that in mind, the Proposal could still enforce\\nan individual’s need to be informed and to understand the impacts. Therefore, transparency and\\nexplainability of the design, development and implementation of AI systems, as well as of their output,\\nremain paramount. 
There must also be understandable and effective methods for stakeholders to\\nadjust the specific standards, such as what uses are forbidden, in the case of unforeseen use cases\\nand impacts or of the recognition of previously ignored violations of the European principles.\\nProposed role of AI Regulation\\nRequirements such as documentation and transparency should\\nspecifically serve stakeholders’ needs to understand the implications of AI systems for their specific\\nsituation, life, and work.\\n9\\nConclusion\\nIn this position paper, we presented technical, legal, industrial, sociotechnical, and (moral) philo-\\nsophical perspectives on the debate on fairness in AI systems with a particular focus on the Proposal\\nof the EU AI Act. We identified a pattern of common arguments representing a key tension in the\\ndebate with one side containing deontological, egalitarian, localized approaches and the other side\\ncontaining standardized, utilitarian, consequential approaches. We discussed how different (sym-\\nbolic) ends of the axes could be reconciled and proposed the following roles that the AI Regulation\\ncould take to successfully address these tensions: (1) apply standardization to uphold deontolog-\\nical values, but ensure universalization by including flexible localization allowances; (2) lay out\\na framework to incorporate stakeholder inputs and ensure that they are sufficiently aware of po-\\ntential adverse implications of AI technology; and (3) design requirements of documentation and\\ntransparency so that they serve the needs of stakeholders.\\nAcknowledgements: This work has received funding from the European Union’s Horizon 2020\\nresearch and innovation programme under Marie Sklodowska-Curie Actions (grant agreement num-\\nber 860630) for the project “NoBIAS - Artificial Intelligence without Bias” and (grant agree-\\nment number 860621) for the project “NL4XAI - Natural Language for Explainable AI”. The\\n16\\nauthors would like to thank all the panelists of European AI Regulation Week 2021 (https:\\n//aia-week.pages.citius.usc.es/) for their helpful comments on AI regulation. This work re-\\nflects only the authors’ views and the European Research Executive Agency (REA) is not responsible\\nfor any use that may be made of the information it contains.\\nReferences\\n[1] Angelika Adensamer and Lukas Daniel Klausner. 2021. “Part Man, Part Machine, All Cop”:\\nAutomation in Policing. Frontiers in Artificial Intelligence 4 (2021), 29.\\nhttps://doi.org/\\n10.3389/frai.2021.655486\\n[2] Edward Tristram Albert. 2019. AI in talent acquisition: A review of AI-applications used in\\nrecruitment and selection. Strategic HR Review 18, 5 (2019), 215–221.\\nhttps://doi.org/\\n10.1108/shr-04-2019-0024\\n[3] Doris Allhutter, Florian Cech, Fabian Fischer, Gabriel Grill, and Astrid Mager. 2020. Algo-\\nrithmic profiling of Job Seekers in austria: How austerity politics are made effective. Frontiers\\nin Big Data 3 (2020). https://doi.org/10.3389/fdata.2020.00005\\n[4] Mike Ananny and Kate Crawford. 2018. Seeing without knowing: Limitations of the trans-\\nparency ideal and its application to algorithmic accountability. New Media & Society 20, 3\\n(2018), 973–989. https://doi.org/10.1177/1461444816676645\\n[5] Julia Angwin, JeffLarson, Surya Mattu, and Lauren Kirchner. 2022. Machine Bias*. Ethics\\nof Data and Analytics (2022), 254–264. https://doi.org/10.1201/9781003278290-37\\n[6] Jacqui Ayling and Adriane Chapman. 2021. Putting AI ethics to work: are the tools fit for\\npurpose? 
AI and Ethics (2021), 1–25. https://doi.org/10.1007/s43681-021-00084-x\\n[7] Agathe Balayan and Seda Gürses. 2021. Beyond Debiasing: Regulating AI and Its Inequalities.\\nTechnical Report. Delft University of Technology.\\n[8] Robert Bartlett, Adair Morse, Richard Stanton, and Nancy Wallace. 2019. Consumer-lending\\ndiscrimination in the FinTech era. Technical Report. National Bureau of Economic Research.\\n[9] Richard Berk et al. 2017.\\nA convex framework for fair regression.\\narXiv preprint\\narXiv:1706.02409 (2017).\\n[10] Dimitris Bertsimas, Vivek F. Farias, and Nikolaos Trichakis. 2012. On the efficiency-fairness\\ntrade-off. Management Science 58, 12 (2012), 2234–2250. https://doi.org/10.1287/mnsc.\\n1120.1549\\n[11] Miranda Bogen and Aaron Rieke. 2018. Help Wanted: An Examination of Hiring Algorithms,\\nEquity, and Bias. Report. Upturn.\\n[12] Anu Bradford. 2020. The Brussels effect: How the European Union rules the world. Oxford\\nUniversity Press, USA.\\n[13] Nadia Burkart and Marco F. Huber. 2021.\\nA survey on the explainability of supervised\\nmachine learning. Journal of Artificial Intelligence Research 70 (2021), 245–317.\\nhttps:\\n//doi.org/10.1613/jair.1.12228\\n17\\n[14] Aylin Caliskan, Joanna J Bryson, and Arvind Narayanan. 2017. Semantics derived automat-\\nically from language corpora contain human-like biases. Science 356, 6334 (2017), 183–186.\\nhttps://doi.org/10.1126/science.aal4230\\n[15] Longbing Cao. 2022.\\nAI in finance: Challenges, techniques, and opportunities.\\nComput.\\nSurveys 55, 3 (2022), 1–38. https://doi.org/10.1145/3502289\\n[16] Manuel Carabantes. 2020.\\nBlack-box artificial intelligence:\\nan epistemological and crit-\\nical analysis.\\nAI & SOCIETY\\n35, 2 (2020), 309–317.\\nhttps://doi.org/10.1007/\\ns00146-019-00888-w\\n[17] Daniel C Castro, Ian Walker, and Ben Glocker. 2020.\\nCausality matters in medical\\nimaging.\\nNature Communications 11, 1 (2020), 1–10.\\nhttps://doi.org/10.1038/\\ns41467-020-17478-w\\n[18] Kyla Chasalow and Karen Levy. 2021. Representativeness in Statistics, Politics, and Ma-\\nchine Learning. In Proceedings of the 2021 ACM Conference on Fairness, Accountability,\\nand Transparency. Association for Computing Machinery, New York, NY, USA, 77–89.\\nhttps://doi.org/10.1145/3442188.3445872\\n[19] Henriette Cramer, Jean Garcia-Gathright, Sravana Reddy, Aaron Springer, and Romain\\nTakeo Bouyer. 2019. Translation, tracks & data: an algorithmic bias effort in practice. In\\nExtended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems.\\n1–8. https://doi.org/10.1145/3290607.3299057\\n[20] Alexander D’Amour, Hansa Srinivasan, James Atwood, Pallavi Baljekar, David Sculley, and\\nYoni Halpern. 2020. Fairness is not static: deeper understanding of long term fairness via\\nsimulation studies. In Proceedings of the 2020 Conference on Fairness, Accountability, and\\nTransparency. 525–534. https://doi.org/10.1145/3351095.3372878\\n[21] Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, and Krishna P\\nGummadi. 2021. When the umpire is also a player: Bias in private label product recommenda-\\ntions on e-commerce marketplaces. In Proceedings of the 2021 ACM Conference on Fairness,\\nAccountability, and Transparency. 873–884. https://doi.org/10.1145/3442188.3445944\\n[22] William Dieterich, Christina Mendoza, and MS Tim Brennan. 2016. COMPAS Risk Scales:\\nDemonstrating Accuracy Equity and Predictive Parity.\\n[23] Tom Dobber, Ronan Ó Fathaigh, and Frederik Zuiderveen Borgesius. 2019. 
The regulation of\\nonline political micro-targeting in Europe. Internet Policy Review 8, 4 (2019).\\n[24] Cory Doctorow. 2021. How to Destroy ‘Surveillance Capitalism’. Medium Editions.\\n[25] Paul Dourish and Genevieve Bell. 2011. Divining a Digital Future: Mess and Mythology in\\nUbiquitous Computing. MIT Press, Cambridge, Mass.\\n[26] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. 2012.\\nFairness through awareness. Proceedings of the 3rd Innovations in Theoretical Computer Sci-\\nence Conference on - ITCS ’12. https://doi.org/10.1145/2090236.2090255\\n[27] M. C. Elish and danah boyd. 2017.\\nSituating methods in the magic of Big Data and ai.\\nCommunication Monographs 85, 1 (2017), 57–80.\\nhttps://doi.org/10.1080/03637751.\\n2017.1375130\\n18\\n[28] Robert Epstein. 2019. Why Google Poses a Serious Threat to Democracy, and How to End\\nThat Threat. America Institute for Behavioral Research and Technology (2019).\\n[29] Jessie Finocchiaro, Roland Maio, Faidra Monachou, Gourab K Patro, Manish Raghavan,\\nAna-Andreea Stoica, and Stratis Tsirtsis. 2021. Bridging Machine Learning and mechanism\\ndesign towards Algorithmic Fairness. Proceedings of the 2021 ACM Conference on Fairness,\\nAccountability, and Transparency. https://doi.org/10.1145/3442188.3445912\\n[30] Ben Green and Yiling Chen. 2019. Disparate interactions: An algorithm-in-the-loop analysis of\\nfairness in risk assessments. In Proceedings of the 2019 Conference on Fairness, Accountability,\\nand Transparency. 90–99. https://doi.org/10.1145/3287560.3287563\\n[31] Graham Greenleaf. 2021. The ‘Brussels Effect’ of the EU’s ‘AI Act’ on Data Privacy Outside\\nEurope. , 3-7 pages. https://papers.ssrn.com/abstract=3898904\\n[32] Nina Grgić-Hlača, Elissa M Redmiles, Krishna P Gummadi, and Adrian Weller. 2018. Human\\nperceptions of fairness in algorithmic decision making: A case study of criminal risk prediction.\\nIn Proceedings of the 2018 World Wide Web Conference - WWW ’18. 903–912.\\nhttps:\\n//doi.org/10.1145/3178876.3186138\\n[33] Nina Grgić-Hlača, Muhammad Bilal Zafar, Krishna P Gummadi, and Adrian Weller. 2018.\\nBeyond distributive fairness in algorithmic decision making: Feature selection for procedurally\\nfair learning. In Thirty-Second AAAI Conference on Artificial Intelligence.\\nhttps://ojs.\\naaai.org/index.php/AAAI/article/view/11296\\n[34] Ruocheng Guo, Lu Cheng, Jundong Li, P Richard Hahn, and Huan Liu. 2020. A survey of\\nlearning causality with data: Problems and methods. ACM Computing Surveys (CSUR) 53,\\n4 (2020), 1–37. https://doi.org/10.1145/3397269\\n[35] Abhishek Gupta, Connor Wright, Marianna Bergamaschi Ganapini, Masa Sweidan, and Ren-\\njie Butalid. 2022.\\nState of AI Ethics Report (Volume 6, February 2022).\\narXiv preprint\\narXiv:2202.07435 (2022).\\n[36] Alisha Haridasani Gupta. 2019. Are Algorithms Sexist? The New York Times (2019).\\n[37] Meeri Haataja and Joanna J. Bryson. 2021. What costs should we expect from the EU’s AI\\nAct? SocArXiv. Center for Open Science.\\n[38] Philipp Hacker. 2018. Teaching fairness to artificial intelligence: Existing and novel strategies\\nagainst algorithmic discrimination under EU law. Common Market Law Review 55, 4 (2018),\\n1143–1185. https://doi.org/10.54648/cola2018095\\n[39] Bernard E. Harcourt. 2007.\\nAgainst Prediction: Profiling, Policing, and Punishing in an\\nActuarial Age. University of Chicago Press. viii, 336 pages.\\n[40] Moritz Hardt, Eric Price, and Nati Srebro. 2016. 
Equality of opportunity in supervised learn-\\ning. In Advances in Neural Information Processing Systems, Vol. 29. 3315–3323.\\nhttps://\\nproceedings.neurips.cc/paper/2016/file/9d2682367c3935defcb1f9e247a97c0d-Paper.\\npdf\\n[41] John C. Havens. 2018. Creating the human standard for ethical autonomous and intelligent\\nsystems (A/IS). AI Matters 4 (4 2018), 28–31. Issue 1. https://doi.org/10.1145/3203247.\\n3203255\\n19\\n[42] Thomas T Hills. 2019. The Dark Side of Information Proliferation. Perspectives on Psycho-\\nlogical Science 14 (2019), 323–330. https://doi.org/10.1177/1745691618803647\\n[43] Anna Lauren Hoffmann. 2019. Where fairness fails: data, algorithms, and the limits of an-\\ntidiscrimination discourse. 22, 7 (06 2019), 900–915.\\nhttps://doi.org/10.1080/1369118x.\\n2019.1573912\\n[44] Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé III, Miro Dudik, and Hanna Wal-\\nlach. 2019. Improving fairness in machine learning systems: What do industry practitioners\\nneed?. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems.\\n1–16. https://doi.org/10.1145/3290605.3300830\\n[45] Brad Hooker. 2014. Utilitarianism and fairness. Cambridge University Press, 280–302.\\n[46] Dietmar Hübner. 2021. Two kinds of discrimination in AI-based penal decision-making. ACM\\nSIGKDD Explorations Newsletter 23, 1 (2021), 4–13. https://doi.org/10.1145/3468507.\\n3468510\\n[47] Javier Camacho Ibáñez and Mónica Villas Olmeda. 2021. Operationalising AI ethics: How\\nare companies bridging the gap between practice and principles? An exploratory study. (08\\n2021).\\nhttps://doi.org/10.1007/s00146-021-01267-0\\n[48] Information Commissioner’s Office (ICO). 2021. Guide to the General Data Protection Reg-\\nulation (GDPR). https://ico.org.uk/for-organisations/guide-to-data-protection/\\nguide-to-the-general-data-protection-regulation-gdpr/\\n[49] Shelly Kagan. 2018. Normative ethics. Routledge.\\n[50] Mamoru Kaneko and Kenjiro Nakamura. 1979. The Nash Social Welfare function. Economet-\\nrica: Journal of the Econometric Society 47, 2 (1979), 423–435. https://doi.org/10.2307/\\n1914191\\n[51] Simon Keller. 2009. Welfarism. Philosophy Compass 4, 1 (2009), 82–95. https://doi.org/\\n10.1111/j.1747-9991.2008.00196.x\\n[52] Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan. 2017. Inherent Trade-Offs in\\nthe Fair Determination of Risk Scores. In 8th Innovations in Theoretical Computer Science\\nConference (ITCS 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 1–23.\\nhttps:\\n//doi.org/10.4230/LIPIcs.ITCS.2017.43\\n[53] Ansgar Koene, Adam Leon Smith, Takashi Egawa, Sukanya Mandalh, and Yohko Hatada.\\n2018.\\nIEEE P70xx, Establishing Standards for Ethical Technology.\\nProceedings of KDD,\\nExCeL London UK (8 2018), 1–2.\\n[54] P. M. Krafft, Meg Young, Michael Katell, Karen Huang, and Ghislain Bugingo. 2019. Defining\\nAI in Policy versus Practice.\\nhttps://papers.ssrn.com/abstract=3431304\\n[55] Matt J Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. 2017. Counterfactual fair-\\nness. In Advances in Neural Information Processing Systems, Vol. 30. 4066–4076.\\nhttps://\\nproceedings.neurips.cc/paper/2017/file/a486cd07e4ac3d270571622f4f316ec5-Paper.\\npdf\\n20\\n[56] Anja Lambrecht and Catherine Tucker. 2019. Algorithmic bias? an empirical study of appar-\\nent gender-based discrimination in the display of STEM career ads. Management Science 65,\\n7 (2019), 2966–2981. https://doi.org/10.1287/mnsc.2018.3093\\n[57] Moritz Laurer, Andrea Renda, and Timothy Yeung. 
2021. Clarifying the costs for the EU’s\\nAI Act. Technical Report.\\n[58] Richard Layard and Stephen Gllaister. 1994. Cost-benefit analysis. Cambridge University\\nPress, Cambridge, UK.\\n[59] Michelle Seng Ah Lee and Jatinder Singh. 2021. The landscape and gaps in open source\\nfairness toolkits. Proceedings of the 2021 CHI Conference on Human Factors in Computing\\nSystems. https://doi.org/10.1145/3411764.3445261\\n[60] Mark MacCarthy and Kenneth Propp. 2021. Machines learn that Brussels writes the rules:\\nThe EU’s new AI regulation. Brookings, May 4 (2021), 2021.\\n[61] Michael A. Madaio, Luke Stark, Jennifer Wortman Vaughan, and Hanna Wallach. 2020.\\nCo-Designing Checklists to Understand Organizational Challenges and Opportunities around\\nFairness in AI. In Proceedings of the 2020 CHI Conference on Human Factors in Computing\\nSystems. 1–14. https://doi.org/10.1145/3313831.3376445\\n[62] Karima Makhlouf, Sami Zhioua, and Catuscia Palamidessi. 2021.\\nOn the applicability of\\nmachine learning fairness notions. ACM SIGKDD Explorations Newsletter 23, 1 (2021), 14–\\n23. https://doi.org/10.1145/3468507.3468511\\n[63] Karl Manheim and Lyric Kaplan. 2019. Artificial intelligence: Risks to privacy and democracy.\\nYale JL & Tech. 21 (2019), 106. https://ssrn.com/abstract=3273016\\n[64] Ian Manners. 2002. Normative Power Europe: A Contradiction in Terms? 40, 2 (06 2002),\\n235–258.\\n[65] Mark McFadden, Kate Jones, Emily Taylor, and Georgia Osborn. 2021. Harmonising Artificial\\nIntelligence: The Role of Standards in the EU AI Regulation. (2021).\\n[66] Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan.\\n2021. A Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 54, 6, Article\\n115 (jul 2021), 35 pages.\\nhttps://doi-org.ezbusc.usc.gal/10.1145/3457607\\n[67] Milagros Miceli, Martin Schuessler, and Tianling Yang. 2020. Between Subjectivity and Im-\\nposition: Power Dynamics in Data Annotation for Computer Vision. Proceedings of the ACM\\non Human-Computer Interaction 4, CSCW2 (Oct. 2020), 1–25.\\nhttps://doi.org/10.1145/\\n3415186\\n[68] Shira Mitchell, Eric Potash, Solon Barocas, Alexander D’Amour, and Kristian Lum.\\n2021.\\nPrediction-Based Decisions and Fairness:\\nA Catalogue of Choices,\\nAssump-\\ntions,\\nand\\nDefinitions.\\n8,\\n1\\n(03\\n2021),\\n141–163.\\nhttps://doi.org/10.1146/\\nannurev-statistics-042720-125902 arXiv:1811.07867\\n[69] Brent Mittelstadt. 2019. Principles alone cannot guarantee ethical AI. Nature Machine Intel-\\nligence 1, 11 (11 2019), 501–507.\\nhttps://doi.org/10.1038/s42256-019-0114-4\\n21\\n[70] Andrew Moore. 2013.\\nHedonism.\\nStanford University.\\nhttps://plato.stanford.edu/\\nentries/hedonism/\\n[71] Jessica Morley, Anat Elhalal, Francesca Garcia, Libby Kinsey, Jakob Mökander, and Luciano\\nFloridi. 2021. Ethics as a Service: A Pragmatic Operationalisation of AI Ethics. 31, 2 (2021),\\n239–256.\\nhttps://doi.org/10.1007/s11023-021-09563-w\\n[72] Benjamin Mueller. 2021. How Much Will the Artificial Intelligence Act Cost Europe? Tech-\\nnical Report. Center for Data Innovation.\\n[73] Sendhil Mullainathan. 2018. Algorithmic fairness and the social welfare function. In Pro-\\nceedings of the 2018 ACM Conference on Economics and Computation. 1–1.\\nhttps:\\n//doi.org/10.1145/3219166.3219236\\n[74] Catelijne Muller. 2020. The Impact of Artificial Intelligence on Human Rights, Democracy\\nand the Rule of Law. Technical Report. Council of Europe, Strasbourg.\\n[75] Helen Nissenbaum. 2009. 
Privacy in Context: Technology, Policy, and the Integrity of Social\\nLife. Stanford University Press.\\nhttps://doi.org/10.1515/9780804772891\\n[76] Eirini Ntoutsi et al. 2020. Bias in data-driven artificial intelligence systems—An introductory\\nsurvey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 10 (5 2020).\\nIssue 3. https://doi.org/10.1002/widm.1356\\n[77] Council of the European Union. 2000. Council Directive 2000/78/EC of 27 November 2000\\nestablishing a general framework for equal treatment in employment and occupation.\\n[78] Council of the European Union. 2000.\\nCouncil Directive E 2000/43/EC of 29 June 2000\\nimplementing the principle of equal treatment between persons irrespective of racial or ethnic\\norigin.\\n[79] Council of the European Union. 2004. Council Directive 2004/113/EC of 13 December 2004\\nimplementing the principle of equal treatment between men and women in the access to and\\nsupply of goods and services.\\n[80] Commission of the European Union. 2021.\\nCommunication from the Commission to the\\nEuropean Parliament, the Council, the European Economic and Social Committee and the\\nCommittee of the Regions 2013 Digital Compass: the European way for the Digital Decade.\\n[81] Will Orr and Jenny L Davis. 2020.\\nAttributions of ethical responsibility by Artificial In-\\ntelligence practitioners.\\nInformation, Communication & Society 23, 5 (2020), 719–735.\\nhttps://doi.org/10.1080/1369118x.2020.1713842\\n[82] European Parliament and Council. 2007. Charter of Fundamental Rights of the European\\nUnion.\\n[83] European Parliament and Council of the European Union. 2006. Directive 2006/54/EC Of the\\nEuropean Parliament and of the Council of 5 July 2006 on the implementation of the principle\\nof equal opportunities and equal treatment of men and women in matters of employment and\\noccupation.\\n22\\n[84] European Parliament and Council of the European Union. 2016. Regulation (EU) 2016/679\\non the protection of natural persons with regard to the processing of personal data and on\\nthe free movement of such data, and repealing Directive 95/46/EC (General Data Protection\\nRegulation).\\n[85] European Parliament and Council of the European Union. 2021. Proposal for a Regulation\\nof the European Parliament and of the Council laying down harmonised rules on Artificial\\nIntelligence (Artificial Intelligence Act) and amending certain Union legislative acts.\\n[86] Herbert James Paton. 1971. The categorical imperative: A study in Kant’s moral philosophy.\\nVol. 1023. University of Pennsylvania Press.\\n[87] Gourab K Patro, Arpita Biswas, Niloy Ganguly, Krishna P Gummadi, and Abhijnan\\nChakraborty. 2020. FairRec: Two-sided fairness for personalized recommendations in two-\\nsided platforms. In Proceedings of The Web Conference 2020. 1194–1204. https://doi.org/\\n10.1145/3366423.3380196\\n[88] Martin Pawelczyk, Sascha Bielawski, Johannes van den Heuvel, Tobias Richter, and Gjergji\\nKasneci. 2021. CARLA: A Python Library to Benchmark Algorithmic Recourse and Counter-\\nfactual Explanation Algorithms. In Proceedings of the Neural Information Processing Systems\\nTrack on Datasets and Benchmarks 2021.\\n[89] Martin Pawelczyk, Klaus Broelemann, and Gjergji Kasneci. 2020. On Counterfactual Ex-\\nplanations under Predictive Multiplicity. In Proceedings of the Thirty-Sixth Conference on\\nUncertainty in Artificial Intelligence, UAI 2020 (Proceedings of Machine Learning Research,\\nVol. 124). 
AUAI Press, 809–818.\\n[90] Seeta Peña Gangadharan and Jędrzej Niklas. 2019. Decentering Technology in Discourse on\\nDiscrimination. Information, Communication & Society 22, 7 (June 2019), 882–899.\\nhttps:\\n//doi.org/10.1080/1369118X.2019.1593484\\n[91] Inioluwa Deborah Raji et al. 2020. Closing the AI accountability gap: Defining an end-to-end\\nframework for internal algorithmic auditing. In Proceedings of the 2020 conference on fairness,\\naccountability, and transparency. 33–44. https://doi.org/10.1145/3351095.3372873\\n[92] Bogdana Rakova, Jingying Yang, Henriette Cramer, and Rumman Chowdhury. 2021. Where\\nresponsible AI meets reality: Practitioner perspectives on enablers for shifting organizational\\npractices. Proceedings of the ACM on Human-Computer Interaction 5, CSCW1 (2021), 1–23.\\nhttps://doi.org/10.1145/3449081\\n[93] John Rawls. 2009. A theory of justice. Harvard University Press, Cambridge, Mass.\\n[94] Richard M. Re and Alicia Solow-Niederman. 2019. Developing artificially intelligent justice.\\nStan. Tech. L. Rev. 22 (2019), 242. https://ssrn.com/abstract=3390854\\n[95] Andrea Renda et al. 2021. Study to support an impact assessment of regulatory requirements\\nfor artificial intelligence in Europe. Technical Report. European Commission - Directorate-\\nGeneral for Communications Networks, Content and Technology.\\n[96] Filipe N. Ribeiro et al. 2018. Media Bias Monitor : Quantifying Biases of Social Media News\\nOutlets at Large-Scale. In Twelfth International AAAI Conference on Web and Social Media.\\nAAAI Press, Palo Alto, California, 290–299.\\nhttps://aaai.org/ocs/index.php/ICWSM/\\nICWSM18/paper/view/17878\\n23\\n[97] Brianna Richardson and Juan E. Gilbert. 2021. A Framework for Fairness: A Systematic\\nReview of Existing Fair AI Solutions. (12 2021). arXiv:2112.05700 http://arxiv.org/abs/\\n2112.05700\\n[98] Boris Ruf and Marcin Detyniecki. 2021. Towards the Right Kind of Fairness in AI. (09 2021).\\narXiv:2102.08453 [cs] http://arxiv.org/abs/2102.08453\\n[99] Ajay Sandhu and Peter Fussey. 2021. The ‘uberization of policing’? How police negotiate\\nand operationalise predictive policing technology. Policing and Society 31, 1 (2021), 66–81.\\nhttps://doi.org/10.1080/10439463.2020.1803315\\n[100] Cristian Santesteban and Shayne Longpre. 2020. How big data confers market power to Big\\nTech: Leveraging the perspective of data science. The Antitrust Bulletin 65, 3 (2020), 459–485.\\nhttps://doi.org/10.1177/0003603x20934212\\n[101] Laura Sartori and Andreas Theodorou. 2022. A sociotechnical perspective for the future of AI:\\nnarratives, inequalities, and human control. Ethics and Information Technology 24, 1 (2022),\\n1–11. https://doi.org/10.1007/s10676-022-09624-3\\n[102] Daniel S. Schiff, Kelly Laas, Justin B. Biddle, and Jason Borenstein. 2022. Global AI Ethics\\nDocuments: What They Reveal About Motivations, Practices, and Policies. In Codes of Ethics\\nand Ethical Guidelines: Emerging Technologies, Changing Fields. Springer International Pub-\\nlishing, 121–143.\\nhttps://doi.org/10.1007/978-3-030-86201-5_7\\n[103] Andrew D. Selbst, danah boyd, Sorelle A. Friedler, Suresh Venkatasubramanian, and Janet\\nVertesi. 2019.\\nFairness and Abstraction in Sociotechnical Systems. In Proceedings of the\\nConference on Fairness, Accountability, and Transparency (Atlanta, GA, USA) (FAT* ’19).\\nAssociation for Computing Machinery, New York, NY, USA, 59–68.\\nhttps://doi.org/10.\\n1145/3287560.3287598\\n[104] Amartya Sen. 1979. Utilitarianism and welfarism. 
The journal of Philosophy 76, 9 (1979),\\n463–489. https://doi.org/10.2307/2025934\\n[105] Josh\\nSimons\\nand\\nDipayan\\nGhosh.\\n2022.\\nUtilities\\nfor\\ndemoc-\\nracy:\\nWhy\\nand\\nhow\\nthe\\nAlgorithmic\\nInfrastructure\\nof\\nFacebook\\nand\\nGoogle\\nmust\\nbe\\nregulated.\\nhttps://www.brookings.edu/research/\\nutilities-for-democracy-why-and-how-the-algorithmic-infrastructure-of-facebook-and-google-\\n[106] Taylor Telford. 2019.\\nApple Card algorithm sparks gender bias allegations against\\nGoldman\\nSachs.\\nhttps://www.washingtonpost.com/business/2019/11/11/\\napple-card-algorithm-sparks-gender-bias-allegations-against-goldman-sachs/\\n[107] Ehsan Toreini et al. 2020. The relationship between trust in AI and trustworthy machine\\nlearning technologies. In Proceedings of the 2020 conference on fairness, accountability, and\\ntransparency. 272–283. https://doi.org/10.1145/3351095.3372834\\n[108] Antje Von Ungern-Sternberg et al. 2022.\\nDiscriminatory AI and the Law– Legal stan-\\ndards for algorithmic profiling. In The Cambridge Handbook of Responsible Artificial Intel-\\nligence: Interdisciplinary Perspectives (Cambridge Law Handbooks). Cambridge University\\nPress. https://ssrn.com/abstract=3876657\\n24\\n[109] Michael Veale and Frederik Zuiderveen Borgesius. 2021. Demystifying the Draft EU Artificial\\nIntelligence Act — Analysing the Good, the Bad, and the Unclear Elements of the Proposed\\nApproach. Computer Law Review International 22, 4 (Aug. 2021), 97–112.\\nhttps://doi.\\norg/doi:10.9785/cri-2021-220402\\n[110] Michael Veale, Max Van Kleek, and Reuben Binns. 2018. Fairness and accountability design\\nneeds for algorithmic support in high-stakes public sector decision-making. In Proceedings of\\nthe 2018 CHI Conference on Human Factors in Computing Systems. 1–14.\\nhttps://doi.\\norg/10.1145/3173574.3174014\\n[111] Sahil Verma and Julia Rubin. 2018. Fairness Definitions Explained. In Proceedings of the Inter-\\nnational Workshop on Software Fairness (New York, NY, USA) (FairWare ’18). Association\\nfor Computing Machinery, 1–7.\\nhttps://doi.org/10.1145/3194770.3194776\\n[112] Julius von Kügelgen, Amir-Hossein Karimi, Umang Bhatt, Isabel Valera, Adrian Weller,\\nand Bernhard Schölkopf. 2021.\\nOn the Fairness of Causal Algorithmic Recourse.\\narXiv:2010.06529 [cs, stat] http://arxiv.org/abs/2010.06529\\n[113] Raphaële Xenidis and Linda Senden. 2019. EU non-discrimination law in the era of artificial\\nintelligence: Mapping the challenges of algorithmic discrimination. In Ulf Bernitz et al (eds),\\nGeneral Principles of EU law and the EU Digital Order. Kluwer Law International, 2020,\\n151–182. https://ssrn.com/abstract=3529524\\n[114] Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P Gummadi.\\n2017. Fairness beyond disparate treatment & disparate impact: Learning classification without\\ndisparate mistreatment. In Proceedings of the 26th international conference on world wide web.\\n1171–1180. https://doi.org/10.1145/3038912.3052660\\n[115] Muhammad Bilal Zafar, Isabel Valera, Manuel Rodriguez, Krishna Gummadi, and Adrian\\nWeller. 2017. From parity to preference-based notions of fairness in classification. In Advances\\nin Neural Information Processing Systems. 229–239.\\n[116] Aleš Završnik. 2020. Criminal justice, artificial intelligence systems, and human rights. In\\nERA Forum, Vol. 20. Springer, 567–583. https://doi.org/10.1007/s12027-020-00602-0\\n[117] Shoshana Zuboff. 2019. 
Surveillance Capitalism and the Challenge of Collective Action. New\\nLabor Forum 28, 1 (Jan. 2019), 10–29.\\nhttps://doi.org/10.1177/1095796018819461\\n[118] Frederik Zuiderveen Borgesius. 2018. Discrimination, artificial intelligence, and algorithmic\\ndecision-making. Technical Report. Strasbourg Council of Europe.\\n25\\n\\n\\n\\n---\\n\\n\\nConnecting the Dots in Trustworthy Artificial Intelligence: From AI\\nPrinciples, Ethics, and Key Requirements to Responsible AI Systems\\nand Regulation\\nNatalia Díaz-Rodrígueza,∗, Javier Del Serb,c,∗, Mark Coeckelberghd, Marcos López de Pradoe,f,g,\\nEnrique Herrera-Viedmaa and Francisco Herreraa\\naDepartment of Computer Science and Artificial Intelligence, DaSCI Andalusian Institute in Data Science and Computational Intelligence, University of\\nGranada, Granada 18071, Spain\\nbTECNALIA, Basque Research and Technology Alliance (BRTA), 48160 Derio, Spain\\ncDepartment of Communications Engineering, University of the Basque Country (UPV/EHU), 48013 Bilbao, Spain\\ndDepartment of Philosophy, University of Vienna, Vienna, 1010, Austria\\neSchool of Engineering, Cornell University, Ithaca, NY, 14850, United States\\nfADIA Lab, Al Maryah Island, Abu Dhabi, United Arab Emirates\\ngDepartment of Mathematics, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates\\nA R T I C L E I N F O\\nKeywords:\\nTrustworthy AI\\nAI Ethics\\nResponsible AI systems\\nAI Regulation\\nRegulatory Sandbox\\nA B S T R A C T\\nTrustworthy Artificial Intelligence (AI) is based on seven technical requirements sustained over three\\nmain pillars that should be met throughout the system’s entire life cycle: it should be (1) lawful,\\n(2) ethical, and (3) robust, both from a technical and a social perspective. However, attaining truly\\ntrustworthy AI concerns a wider vision that comprises the trustworthiness of all processes and actors\\nthat are part of the system’s life cycle, and considers previous aspects from different lenses. A more\\nholistic vision contemplates four essential axes: the global principles for ethical use and development\\nof AI-based systems, a philosophical take on AI ethics, a risk-based approach to AI regulation, and the\\nmentioned pillars and requirements. The seven requirements (human agency and oversight; robustness\\nand safety; privacy and data governance; transparency; diversity, non-discrimination and fairness;\\nsocietal and environmental wellbeing; and accountability) are analyzed from a triple perspective:\\nWhat each requirement for trustworthy AI is, Why it is needed, and How each requirement can be\\nimplemented in practice. On the other hand, a practical approach to implement trustworthy AI systems\\nallows defining the concept of responsibility of AI-based systems facing the law, through a given\\nauditing process. Therefore, a responsible AI system is the resulting notion we introduce in this work,\\nand a concept of utmost necessity that can be realized through auditing processes, subject to the\\nchallenges posed by the use of regulatory sandboxes. Our multidisciplinary vision of trustworthy AI\\nculminates in a debate on the diverging views published lately about the future of AI. Our reflections\\nin this matter conclude that regulation is a key for reaching a consensus among these views, and that\\ntrustworthy and responsible AI systems will be crucial for the present and future of our society.\\n1. Introduction\\nWe are witnessing an unprecedented upsurge of Artifi-\\ncial Intelligence (AI) systems. 
Despite its important histori-\\ncal development, in the last years AI has vigorously entered\\nall professional and social domains of applications, from\\nautomation to healthcare, education and beyond. Recently, a\\nfamily of generative AI (DALL-E1 [1], Imagen2 [2] or large\\nlanguage model products such as ChatGPT3) have sparked\\na significant amount of debates. These arise as a concern on\\nwhat this could mean in all fields of application and what\\nimpact they could have.\\nThe views expressed in this article are the authors’, and are not\\nnecessarily the views of the institutions they are affiliated with.\\n∗These authors contributed equally. Corresponding authors: natalia-\\ndiaz@ugr.es (N. Díaz-Rodríguez), javier.delser@tecnalia.com (J. Del Ser).\\n1DALL-E, https://openai.com/product/dall-e-2, accessed on April\\n25th, 2023.\\n2Google Imagen, https://imagen.research.google/, accessed on April\\n25th, 2023.\\n3Chat Generative Pre-Trained Transformer (Chat GPT), https://chat\\n.openai.com/chat, accessed on April 25th, 2023.\\nThese debates, active for years now, pose questions re-\\ngarding the ethical aspects and requirements that AI systems\\nmust comply with. They emerge from the ethical principles’\\nperspective, from the regulation ones, from what it means to\\nhave fair AI, or from the technological point of view, on what\\nan ethical development and use of AI systems really mean.\\nThe notion of trustworthy AI has attracted particular interest\\nacross the political institutions of the European Union (EU).\\nThe EU has intensively worked on elaborating this concept\\nthrough a set of guidelines based on ethical principles and\\nrequirements for trustworthy AI [3].\\nTrustworthy AI is a holistic and systemic approach that\\nacts as prerequisite for people and societies to develop,\\ndeploy and use AI systems [3]. It is composed of three pillars\\nand seven requirements: the legal, ethical, and technical\\nrobustness pillars; and the following requirements: human\\nagency and oversight; technical robustness and safety; pri-\\nvacy and data governance; transparency; diversity, non-\\ndiscrimination and fairness; societal and environmental\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 1 of 30\\narXiv:2305.02231v2 [cs.CY] 12 Jun 2023\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nFrom Trustworthy AI to \\nResponsible AI Systems\\nPrinciples for ethical use \\nand development of AI \\n(Section 2)\\nArtificial Intelligence regulation: \\nA risk-based approach\\n(Section 4)\\nA philosophical approach \\nto AI ethics\\n(Section 3)\\nPillars and \\nRequirements of \\nTrustworthy AI\\n(Section 5)\\nHuman \\nagency & \\noversight\\nTechnical \\nrobustness & \\nsafety\\nPrivacy & data \\ngovernance\\nTransparency\\nDiversity, non-\\ndiscrimination \\n& fairness\\nSocietal & \\nenvironmental \\nwellbeing\\nAccountability\\nRobustness\\nLawfulness\\nEthics\\nFigure 1: Our holistic approach to attain responsible AI systems from trustworthy AI breaks down trustworthy AI into 4 critical\\naxes: assuring the principles for ethical development and use of AI, philosophically reflecting on AI ethics, complying with AI\\nregulation, and fulfilling Trustworthy AI requirements.\\nwellbeing; and accountability. Although the previous def-\\ninition is based on requirements, there is a larger multidi-\\nmensional vision. 
It considers the ethical debate per se, the\\nethical principles and a risk-based approach to regulation,\\nbacked up by the EU AI Act [4].\\nThe goal of this paper is to become a primer for re-\\nsearchers and practitioners interested in a holistic vision of\\ntrustworthy AI from 4 axes (Fig. 1): from ethical principles\\nand AI ethics, to legislation and technical requirements. Ac-\\ncording to this vision, our analysis tackles the main aspects\\nof trustworthy AI in a non-exhaustive but technical fashion,\\nby:\\n• Providing a holistic vision of the multifaceted notion of\\ntrustworthy AI that considers its diverse principles for eth-\\nical use and development of AI, seen from international\\nagencies, governments and the industry.\\n• Breaking down this multidimensional vision of trustwor-\\nthy AI into 4 axes, to reveal the intricacies associated to\\nits pillars, its technical and legal requirements, and what\\nresponsibility in this context really means.\\n• Examining requirements for trustworthy AI, addressing\\nwhat each requirement actually means, why it is necessary\\nand proposed, and how it is being addressed technologi-\\ncally. While this paper is not intended to be an exhaustive\\nreview, we will delve into an overview of technical possi-\\nbilities to address the aforementioned seven key require-\\nments for trustworthy AI.\\n• Analyzing AI regulation from a pragmatic perspective to\\nunderstand the essentials of the most advanced legal piece\\nexisting so far, the European Commission perspective,\\nand to fully grasp its practical applicability.\\n• Defining responsible AI systems as the result of connect-\\ning the many-sided aspects of trustworthy AI above. This\\nis the notion we advocate for, in order to truly attain trust-\\nworthy AI. Their design should be guided by regulatory\\nsandboxes.\\n• Dissecting currently hot debates on the status of AI,\\nthe moratorium letter to pause giant AI experiments, the\\ncurrent movements around an international regulation and\\nour positioning based on the previous analyses.\\nBy bridging the gap from theory (AI Principles, Ethics,\\nand Key Requirements) to practice (Responsible AI Systems\\nand Regulation), our holistic view offered in this work aims\\nto ultimately highlight the importance of all these elements\\nin the development and integration of human-centered AI-\\nbased systems into the everyday life of humans, in a natural\\nand sustainable way.\\nThe paper is organized as follows: Section 2 revises the\\nmost widely recognized AI principles for the ethical use\\nand development of AI (axis 1). Section 3 considers axis\\n2: a philosophical approach to AI ethics. Section 4 (axis 3)\\npresents the current risk-based viewpoint to AI regulation.\\nSection 5 analyzes axis 4, i.e., key requirements to imple-\\nment trustworthy AI from a technical point of view. Section 6\\ndiscusses the practical applicability of trustworthy AI by first\\nclosing the loop through the necessary definition of respon-\\nsible AI systems, and then exposing the requirements for\\nhigh-risk AI systems to comply with the law. It also proposes\\nthe use of regulatory sandboxes as a challenge to test the\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 2 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nformer requirements via auditing, and a practical blueprint\\nas a case study for AI healthcare. 
We end this section by\\nelaborating on the needs of emerging AI systems (including\\ngeneral purpose AI and neuroscience technology), which de-\\nmand evolved evaluation protocols and dynamic regulation.\\nSection 7 follows by dissecting currently hot debates on the\\nstatus of AI, from the AI moratorium letter to regulation\\nas the key for consensus, including a reflection on the gap\\nto be closed between regulation and technological progress.\\nFinally, Section 8 draws concluding remarks, and highlights\\nthe aforementioned convergence between AI technology and\\nregulation as the beacon for research efforts that safely bring\\nthe benefits of this technology to humanity.\\n2. Principles for ethical use and development\\nof Artificial Intelligence\\nA large set of declarations and guidelines for the ethical\\nuse and development of AI has bloomed. These declarations\\nlead to different similar approaches for introducing sets of\\nprinciples as a departure point for discussing about the\\nresponsible development of AI.\\nIn this section we will analyze three different principle\\ndeclarations. We will start in Section 2.1 with the general\\nUNESCO Recommendation on the Ethics of Artificial In-\\ntelligence, and continue in Section 2.2 taking a practical\\nperspective from the industry point of view through the prin-\\nciples of Responsible AI by Design in Practice by Telefónica,\\na global telecommunication and media company with large\\npresence in the Spanish and Portuguese markets. Finally,\\nin Section 2.3 we analyze the ethical principles based on\\nfundamental rights associated to the European approach.\\n2.1. UNESCO recommendation on the ethics of\\nArtificial Intelligence\\nIn November 2021, UNESCO proposed in Paris a Rec-\\nommendation on the Ethics of AI. Recognizing that AI can\\nbe of great service to humanity and all countries can benefit\\nfrom it, but also can raise fundamental ethical concerns (can\\ndeepen existing divides and inequities in the world), and\\naccounting for the Universal Declaration of Human Rights\\n(1948) and the rest of relevant international recommenda-\\ntions and declarations, the UNESCO Recommendation on\\nthe Ethics of Artificial Intelligence, [5] are the following:\\n1. Proportionality and do no harm: AI methods should\\nnot infringe upon the foundational values in these rec-\\nommendations, should be based on rigorous scientific\\nfoundations, and final human determination should apply.\\n2. Safety and security: Unwanted harms such as safety risks,\\nand vulnerabilities to attacks (security risks) should be\\navoided and addressed throughout the life cycle of AI\\nsystems.\\n3. Fairness and non-discrimination: AI actors should pro-\\nmote social justice and safeguard fairness. Member States\\nshould tackle digital divides ensuring inclusive access\\nand equity, and participation in the development of AI.\\n4. Sustainability: The continuous assessment of the human,\\nsocial, cultural, economic and environmental impact of\\nAI technology should be carried out with “full cog-\\nnizance of the implications of AI for sustainability” as\\na set of constantly evolving goals.\\n5. Right to Privacy, and Data Protection: Privacy must be\\nrespected, protected and promoted throughout the AI life\\ncycle.\\n6. 
Human oversight and determination: Member States\\nshould ensure that it is always possible to attribute ethical\\nand legal responsibility for any stage of the life cycle of\\nAI systems, as well as in cases of remedy related to AI\\nsystems, to physical persons or existing legal entities.\\n7. Transparency and explainability: Transparency is nec-\\nessary for relevant liability regimes to work effectively.\\nAI actors should commit to ensuring that the algorithms\\ndeveloped are explainable, especially in cases that im-\\npact the end user in a way that is not temporary, easily\\nreversible or otherwise low risk.\\n8. Responsibility and accountability: “The ethical responsi-\\nbility and liability for the decisions and auctions based\\nin any way on an AI system should always ultimately be\\nattributable to AI actors”\\n9. Awareness and literacy: Public awareness and under-\\nstanding of AI technologies and the value of data should\\nbe promoted through open and accessible education, civic\\nengagement, digital skills and AI ethics training. All\\nsociety should be able to take informed decisions about\\ntheir use of AI systems and be protected from undue\\ninfluence.\\n10. Multi-stakeholder and adaptive governance and collabo-\\nration: “Participation of different stakeholders through-\\nout the AI system life cycle is necessary for inclusive\\napproaches to AI governance, enabling the benefit to be\\nshared by all, and to contribute to sustainable develop-\\nment”.\\nThe proposed principles are accompanied by values to\\npromote, e.g., human rights and fundamental freedoms. Val-\\nues and principles are designed to be respected by all actors\\ninvolved in the AI system life cycle, being amenable of\\nchange through amendments to existing and new legislation\\nand business guidelines, since they must comply with inter-\\nnational law, the United Nations Charter and Member States.\\n2.2. Telefónica’s principles of Responsible AI by\\nDesign in Practice\\nEnterprises also need to cope with and adapt to new\\ndemands of AI products and associated risks. The previous\\nrecommendations are also aligned with the more generic\\nprinciples for AI defined by the Berkman Klein Center\\nfor Internet & Society at Harvard University that started\\nbeing developed since 2016: Principled AI maps ethical and\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 3 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nrights-based approaches to principles for AI to address is-\\nsues related to the potential threats of AI to both individuals\\nand society as a whole. Derived from these, in industry, e.g.,\\nTelefónica defines the so-called 5 principles of Responsible\\nAI by Design in Practice [6] as:\\n1. Fair AI: the output of AI systems must not lead to\\ndiscrimination.\\n2. Transparent and explainable AI: people should know\\nwhether they are communicating with a person or an AI-\\nbased system.\\n3. Human-centered AI (AI for Social Good, Human-centered\\nAI [7]): AI products and services must be aligned with the\\nUN Sustainable Development Goals.\\n4. Privacy and security by design: standards should be con-\\nsidered during all life cycles, also from the Responsible\\nResearch and Innovation Guidelines [8].\\n5. Extend them to any third party.\\nThe adoption of these and similar principles is part of\\nnew awareness strategies being carried out in companies,\\nsometimes known as change management. 
Telefónica’s ap-\\nproach is only one example of such adoption. This implies\\na change in organizations culture to take into account and\\nimplement these principles on a day-to-day basis.\\n2.3. Ethical principles based on fundamental\\nrights\\nIn Europe, the foundations of trustworthy AI adhere\\nto the four ethical principles proposed by the European\\nCommission’s High-Level Expert Group (HLEG) [3]. These\\nare based on fundamental rights, to which AI practitioners\\nshould always strive to adhere, in order to ensure the devel-\\nopment, deployment and use of AI systems in a trustworthy\\nway. Trustworthy AI is grounded in fundamental rights and\\nreflected by the European Commission’s Ethical Principles:\\n1. Respect for human autonomy: Ensure freedom and au-\\ntonomy of humans interacting with AI systems implies\\nhumans should keep full and effective self-determination\\nover themselves and the ability to take part on democratic\\nprocesses; AI systems should not \\\"unjustifiably subor-\\ndinate, coerce, deceive, manipulate, condition or herd\\nhumans, but rather, argument, complement and empower\\nhuman cognitive, social and cultural skills, leave oppor-\\ntunity for human choice and securing human oversight\\nover work processes\\\" in AI systems, e.g., support humans\\nin the work environment and support the creation of\\nmeaningful work.\\n2. Prevention of harm4: AI systems should not “cause nor\\nexacerbate harm or adversely affect humans”. AI sys-\\ntems should “protect human dignity, mental and physical\\nintegrity, be technically robust and assure they are not\\n4Harm can be individual or collective, can include intangible harm to\\nsocial, cultural, political or natural environments and all living beings.\\nopen to malicious use”. For instance, they should be\\nsupervised so they do not exacerbate adverse impacts due\\nto information asymmetries or unequal balance of power.\\n3. Fairness: Fairness is closely related to the rights to Non-\\ndiscrimination, Solidarity and Justice. Although there are\\nmany different interpretations of fairness, the European\\nCommission advocates for having both: a) a substantive\\ndimension of fairness that \\\"commits to ensure equal and\\njust distribution of benefits and costs, commits to free\\nfrom unfair bias, discrimination and stigmatization, im-\\nplies respecting the principle of proportionality between\\nmeans and ends and a careful balancing of competing\\ninterests and objectives\\\" [3]. b) a procedural dimension\\nallowing to \\\"contest and seek redress against decisions\\ntaken by AI systems or who operates them\\\". To achieve\\nthis, the entity responsible for the decision must be iden-\\ntifiable, while decision making processes should be ex-\\nplainable.\\n4. Explainability: Being crucial for building and maintain-\\ning users trust in the AI system, explainability means\\nthat processes need to be transparent, the capabilities\\nand purpose of AI systems openly communicated, and\\ndecision -to the extent possible- explainable to those\\ndirectly and indirectly affected. When a decision cannot\\nbe duly contested (e.g., because explaining a particular\\nmodel output or decision and what combination of input\\nfactors contributed to it is not always possible), then other\\nexplainability measures may be required (traceability,\\nauditability and transparent communication on the capa-\\nbilities of the AI system). 
This will depend on the context\\nand severity of consequences if an output is erroneous.\\nThese ethical principles are placed in the context of AI\\nsystems. They act as ethical imperatives, and advocate for\\nAI systems to strive to improve individual and collective\\nwellbeing.\\nAs we can see, the mobilization has been worldwide:\\nfrom the Montréal Declaration for a responsible develop-\\nment of AI – an initiative of University of Montréal–, to the\\nEthics of AI recommendations led by international organ-\\nisations such as UNESCO, passing by the adoption led by\\nindustry. All sets of principles share terminology, common\\ngrounds on human rights, and agree on the relevance of\\npreserving human decisions and responsibilities, which are\\nthe most prominent features of ethics of AI.\\n3. A philosophical approach to Artificial\\nIntelligence ethics\\nEthics is an academic discipline which is a subfield of\\nphilosophy and generally deals with questions such as “What\\nis a good action?”, “What is the value of a human life?”,\\n“What is justice?”, or “What is the good life?” [3].\\nAligned with the European Commission ethics guide-\\nlines [3], our ethical vision of AI consists of five main actions\\n[9]. These can help smooth the way to attain ethical AI.\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 4 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nNext, we develop these, taking a philosophical approach to\\nAI ethics:\\n1. Use philosophy and science to examine and critically\\ndiscuss assumptions around the role that AI and humans\\nplay in these scenarios and discussions. For example, one\\ncould critically discuss claims that are made about the\\npossibility of Artificial General Intelligence or human-\\nlevel AI. Large language models, for instance, may give\\nthe impression that they have a human-like level of intel-\\nligence, but work very differently than the human brain\\nand make many mistakes that humans would not make.\\nThis also leads to the question regarding the differences\\nbetween humans and machine, and is also linked to the\\nquestion concerning the moral status of AI. For example,\\nit has been claimed that a chatbot was sentient, while it\\ndid not meet the criteria for sentience. That being said, it\\nis not always clear what these criteria are. AI makes us\\nre-visit philosophical questions concerning moral status.\\n2. Observe attentively the nature of AI and which functions\\nit is assigned to perform today within its diversity of\\napplications. We should recognize the pervasiveness of\\nAI. One reason why it is important to ask ethical ques-\\ntions about AI is that it is pervasive: it is used in many\\napplications such as search, text generation, recommen-\\ndations for commercial products, and so on. In the ethical\\nanalysis, we need to pay attention to the details of each\\napplication\\n3. Discuss the most concrete and pressing ethical and social\\nproblems that AI presents in terms of how it is being\\napplied today. AI raises a number of ethical questions\\nsuch as privacy and data protection, safety, responsibility,\\nand explainability. For example, a chatbot can encour-\\nage someone to take their life. Does this mean that the\\napplication is unsafe? How can we deal with this risk?\\nAnd if something happens, who is responsible? 
Typically,\\nthere are many people involved in technological action.\\nIt is also important to be answerable to those who are\\naffected by the technology [10], for example in the case\\nof a suicide5 the company may need to be answerable to\\nthe family of the victim. Furthermore, it is important that\\nwhen AI offers recommendations for decisions, it is clear\\non what basis these recommendations and decisions are\\ntaken. One problem is that this is usually not clear in the\\ncase of deep learning. In addition, there are societal im-\\nplications such as potential unemployment caused by the\\nautomation that is enabled by AI, and the environmental\\ncosts of AI and its infrastructures through energy use and\\ncarbon emissions linked to the use of the algorithms, the\\nstorage of data, and the production of hardware.\\n4. Investigate AI policies for the near future. There are\\nnow already many policy documents on AI, for example\\nthe Ethics Guidelines for Trustworthy AI produced by\\nthe European Commission’s High-Level Expert Group\\n5https://coeckelbergh.medium.com/chatbots-can-kill-d82fde5cf6ca\\non AI [3] and the Recommendation on the Ethics of\\nArtificial Intelligence [5]. These documents need to be\\ncritically investigated. For example, in the beginning, less\\nattention was given to environmental consequences of AI.\\nA more general problem is that principles and lists of\\nethical considerations are not sufficient; there is still a\\ngap between those principles and implementation in the\\ntechnology, in standards, and in legal regulation.\\n5. Ask ourselves whether the attention that concentrates\\nthe public discourse in AI is useful as we face other\\nproblems, and whether AI should be our unique focus of\\nattention. Given that we also face other global problems\\nsuch as climate change and poverty, it is important to\\nask the question regarding prioritization: Is AI the most\\nimportant problem we face? And if not - if, for instance,\\nwe insist on climate change being the main and most\\nurgent global problem - how does AI impact and perhaps\\ncontribute to this problem, and how can it perhaps help to\\nsolve it? Reflection on these challenges will be important\\nin the coming years.\\nOnce expressed the ethics of AI from the philosophical\\nperspective, the next section will analyze the regulation of\\nAI.\\n4. Artificial Intelligence regulation: A\\nrisk-based approach\\nIn the currently hot debate of AI, a fundamental aspect\\nis regulating AI for it to be righteous. The most advanced\\nregulation to date is the European Commission’s AI Act\\nproposal6 for the regulation of AI [4].\\nIn this section we are presenting AI regulation from two\\nangles; first in Section 4.1 from the perspective of risk of AI\\nsystems and then, in Section 4.2, we make a deeper analysis\\ninto high-risk AI systems.\\n4.1. A risk-based approach to regulate the use of\\nArtificial Intelligence systems\\nThe AI Act draft proposal for a Regulation of the Euro-\\npean Parliament and of the Council laying down harmonized\\nrules on AI [4] is the first attempt to enact a horizontal\\nAI regulation. The proposed legal framework focuses on\\nthe specific use of AI systems. 
The European Commission\\nproposes to establish a technology-neutral definition of AI\\nsystems in EU legislation and defines a classification for AI\\nsystems with different requirements and obligations tailored\\nto a “risk-based approach”, where the obligations for an AI\\nsystem are proportionate to the level of risk that it poses.\\nThe rules of the AI Act specifically consider the risks\\ncreated by AI applications by proposing a list of high-risk\\napplications, setting clear requirements for AI systems for\\nhigh-risk applications, defining specific obligations for AI\\n6On April 27th, 2023, the Members of European Parliament (MEPs)\\nreached a political agreement on the AI Act, https://www.euractiv.com/sec\\ntion/artificial-intelligence/news/meps-seal-the-deal-on-artificial-i\\nntelligence-act/, accessed on May 1st, 2023.\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 5 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nLevel 1: \\nApplications with\\nunacceptable risk\\nLevel 2: \\nApplications\\nwith high risk\\nLevel 3: \\nApplications\\nwith limited risk\\nLevel 4: \\nApplications\\nwithout risk or\\nwith minimal risk\\nVideogames\\nPurchase\\nrecommendation\\nIndustrial \\nprognosis\\nChatbots\\nDeepfakes\\nEmotion\\nrecognition\\nAutonomous\\nVehicle Safety\\nLaw\\nenforcement\\nSocial & education\\napproval procedures\\nMedical \\ndiagnosis\\nFacial recognition\\nSocial scoring\\nSubliminal\\n/obscure manipulation\\nFigure 2: AI Act criticality pyramid and risk-based approach regulatory system for the use of algorithmic systems; SS stands for\\nsubsequent articles (figure extended from the EU Portal8 and inspired from [11] and [12]).\\nusers and providers of high risk applications, proposing\\na conformity assessment before the AI system is put into\\nservice or placed on the market, proposing enforcement\\nafter it is placed in the market, and proposing a governance\\nstructure at European and national levels.\\nThe four levels of risk [4] outlined by the AI Act are the\\nfollowing (Figure 2):\\n• Minimal or No risk: The vast majority of AI systems\\ncurrently used in the EU fall into this category. The\\nproposal allows the free use of minimal-risk AI. Volun-\\ntarily, systems providers of those systems may choose to\\napply the requirements for trustworthy AI and adhere to\\nvoluntary codes of conduct (Art. 69 - Codes of Conduct)7.\\nWhen a compliant AI systems presents a risk, the relevant\\noperator will be required to take measures to ensure the\\nsystem no longer presents that risk, withdraw the system\\nfrom market, or recall the risk for a reasonable period\\ncommensurate with the nature of the risk (Art. 67 -\\nCompliant AI systems which present a risk). For instance:\\nAI-enabled video games or spam filters.\\n• Limited risk: Systems such that users should be aware\\nthat they are interacting with a machine so they can take\\nan informed decision to continue or step back. These\\nhave to comply with specific information/transparency\\nobligations; for instance, chatbots, and systems generating\\ndeepfakes or synthetic content.\\n7Codes of conduct are encouraged by the Commission and the Member\\nStates to foster the voluntary application to AI systems other than high-risk\\nAI systems (HRAIs) “on the basis of technical specification and solutions\\nthat are appropriate means of ensuring compliance with such requirements\\nin light of the intended purpose of the systems” (Art. 
69).\\n8Regulatory framework proposal on Artificial Intelligence, https:\\n//digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai,\\naccessed on April 25th, 2023.\\n• High-risk AI systems (HRAIs): Systems that can have a\\nsignificant impact on the life chances of a user (Art. 6);\\nthey create an adverse impact on people’s safety or their\\nfundamental rights9. Eight types of systems fall into this\\ncategory; these are subject to stringent obligations and\\nmust undergo conformity assessments before being put on\\nthe European market, e.g. systems for law enforcement or\\naccess to education. They will always be high-risk when\\nsubject to third-party conformity assessment under that\\nsectorial legislation.\\n• Unacceptable risk: AI systems considered a clear threat\\nto the safety, livelihoods and rights of people will be pro-\\nhibited in the EU market (Title II — Prohibited Artificial\\nIntelligence Practices, Art. 5). For example: Social scor-\\ning, facial recognition, dark-patterns and manipulative AI\\nsystems, e.g., voice assistance systems that encourage\\ndangerous behavior, or real time remote biometric iden-\\ntification systems in public spaces for law enforcement.\\nAs we can see, very differently to the Chinese, government-\\ncentric, approach, or the US industry-owned-data approach\\nto AI, the EU is taking a human-centric approach to regulate\\nthe use of AI. This risk scenario-based approach regulates\\nusages rather than models and technology themselves. This\\nis the position we defend.\\nSince the beginning of 2023, the European Parliament\\nhas been considering amendments to the law proposing how\\nto conduct fundamental rights impact assessments and other\\nobligations for users of HRAIs. Issues still to be finalized\\ninclude closing the list of HRAI scenarios above exposed,\\n9As protected by the EU Charter of Fundamental Rights (source: https:\\n//ec.europa.eu/commission/presscorner/detail/en/QANDA_21_1683).\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 6 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nprohibited practices, and details concerning the use of copy-\\nrighted content to train AI systems10 and the regulation of\\ngeneral purpose AI systems (GPAIS)11. The first consider-\\nation indicates the EU Parliament will force AI operators\\nto reveal whether they use protected content. The second\\nrequest emerges from the Privacy and Access Council of\\nCanada, who agrees that GPAIS carry serious risks and\\nharmful unintended consequences, and must not be exempt\\nunder the EU AI Act, or equivalent legislation elsewhere.\\nA recent definition of GPAIS can be found in [13]: “An AI\\nsystem that can accomplish a range of distinct valuable tasks,\\nincluding some for which it was not specifically trained”.\\nIt has also been referred to as foundation model [14, JRC\\nGlossary, pag. 32]), but really a GPAIS refers to a model\\nof different nature, beyond the generative AI or foundation\\nmodels that can be considered as specific cases of GPAI\\nsystems. A point of agreement among all definitions to date\\nis the capability of a GPAIS to accomplish tasks beyond\\nthose for which it was originally trained. This is one of the\\nmain reasons why GPAIS have become a pivotal topic of\\ndebate in what refers to AI regulation. Section 6.5 will delve\\nfurther into this.\\n4.2. 
High-risk Artificial Intelligence systems\\nThe European AI Act is predicted to become the global\\nstandard for AI regulation12 by unifying within a single\\nframework the concept of risk acceptability and the trust-\\nworthiness of AI systems by their users [15]. The risk-based\\napproach of the AI Act specifically categorizes as HRAIs\\nthe following eight kind of AI systems (AI Act, Annex III -\\nHigh-risk AI systems referred to in Art. 6(2) [4]):\\n1. Surveillance systems (e.g., biometric identification and\\nfacial recognition systems for law enforcement)\\n2. Systems intended for use as security components in the\\nmanagement and operation of critical digital infrastruc-\\ntures (road traffic and water, gas, heat and electricity\\nsupply).\\n3. Systems to determine access, admission or assignment\\nof people to educational institutions or programs or to\\nevaluate people (for the purpose of evaluating learning\\noutcomes, learning processes or educational programs).\\n4. Systems intended to be used for recruitment or selection\\nof personnel, screening or filtering of applications and\\nevaluation of candidates, or systems for making decisions\\n10Financial Times, European parliament prepares tough measures over\\nuse of AI, https://www.ft.com/content/addb5a77-9ad0-4fea-8ffb-8e2ae250a\\n95a?shareType=nongift, accessed on April 25th, 2023.\\n11Privacy and Access Council of Canada, Five considerations to guide\\nthe regulation of “General Purpose AI”, https://pacc-ccap.ca/five-consi\\nderations-to-guide-the-regulation-of-general-purpose-ai/, accessed on\\nApril 25th, 2023.\\n12The EU AI Act’s Risk-Based Approach: High-Risk Systems and What\\nThey Mean for Users, https://futurium.ec.europa.eu/en/european-ai-all\\niance/document/eu-ai-acts-risk-based-approach-high-risk-systems-and\\n-what-they-mean-users, accessed on April 25th, 2023.\\non promotion and termination of contractual relation-\\nships, assignment of tasks based on individual behavior\\nand the evaluation of performance and behavior.\\n5. Systems for assessing the eligibility for public benefits\\nor assistance, assessing creditworthiness or establish-\\ning credit scores. Systems for dispatching or prioritizing\\nemergency first response services (firefighters, medical\\nfirst aid, etc.).\\n6. Systems to assess the risk of a person committing crime\\nor recidivism, or the risk that he or she is a potential\\noffender.\\n• Systems intended for use as polygraphs or to detect\\nemotional state, or to assess the reliability of evidence\\nin the course of an investigation or prosecution of\\ncrime.\\n• Systems for predicting the occurrence or re-occurrence\\nof crimes based on profiles of people or assessment of\\npersonality traits and characteristics or past criminal\\nbehavior.\\n• Systems for profiling individuals by law enforcement\\nauthorities in the course of detecting, investigating or\\nprosecuting crimes.\\n7. Systems intended for use by competent public authorities\\n(such as polygraphs or to detect the emotional state of\\nindividuals):\\n• Risk assessment systems, including security risks, ir-\\nregular immigration or health risk posed by a person\\nseeking to enter a member state.\\n• Systems for the examination of applications for asy-\\nlum, visas and residence permits and claims associated\\nwith the eligibility of people applying for status.\\n8. 
Systems intended for the administration of justice and\\ndemocratic processes (intended to act on behalf of the\\nauthorities in the administration of justice for the inter-\\npretation of acts or law and the application of the law\\nto a particular set of facts, or evaluation of reliability of\\nevidence).\\nOne fact worth noting in the AI Act is its special em-\\nphasis on the importance of taking into account, when\\nclassifying AI systems, the result of the AI system in relation\\nwith the decision or action taken by a human, as well as the\\nimmediacy of its effect (AI Act Intro, (32) [4]).\\n5. Trustworthy Artificial Intelligence: Pillars\\nand Requirements\\nIn a technical sense, trustworthiness is the confidence of\\nwhether a system/model will act as intended when facing a\\ngiven problem [16]. This confidence generates trust in the\\nuser of the model (the audience), which can be supported\\nfrom multiple perspectives. For instance, trust can be fos-\\ntered when a system provides detailed explanations of its\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 7 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\ndecisions [17]. As Lipton puts it, a person can be more\\nconfident when using a model if he/she understands how it\\nworks and how it produces its decisions [18]. Likewise, trust\\ncan be bolstered if the user is offered guarantees that the\\nmodel can operate robustly under different circumstances,\\nthat it respects privacy, or that it does not get affected by\\nbiases present in the data from which it learns.\\nTrustworthiness is, therefore, a multifaceted requisite for\\npeople and societies to develop, deploy and use AI systems,\\nand a sine qua non condition for the realization of the\\npotentially vast social and economic benefits AI can bring\\n[3]. Moreover, trustworthy does not concern only the system\\nitself, but also other actors and processes that take their part\\nduring the AI life cycle. This requires a holistic and systemic\\nanalysis of the pillars and requirements that contribute to the\\ngeneration of trust in the user of an AI-based system.\\nThis section addresses this need by first dissecting the\\nthree pillars that set the basis for trustworthy AI – namely,\\nlawfulness, ethics and robustness (Subsection 5.1) – fol-\\nlowed by a thorough analysis of the seven requirements\\nproposed by the European Commission’s High-Level Ex-\\npert Group (HLEG): human agency and oversight (Sub-\\nsection 5.2); technical robustness and safety (Subsection\\n5.3); privacy and data governance (Subsection 5.4); Trans-\\nparency (Subsection 5.5); diversity, non-discrimination and\\nfairness (Subsection 5.6); societal and environmental well-\\nbeing (Subsection 5.7); and accountability (Subsection 5.8).\\nDefinitions (what does the requirement stand for?), motiva-\\ntions (why is the requirement relevant for trustworthiness?)\\nand a short glimpse at methodologies (how can the require-\\nment be met in AI-based systems?) will be given for each of\\nthese requirements in their respective sections.\\n5.1. The three pillars of trustworthy Artificial\\nIntelligence\\nIn general, a pillar can be understood as a fundamental\\ntruth of a given idea or concept, from which key require-\\nments to realize the idea can be formulated. Similarly to\\nconstruction engineering, pillars are essential for building\\nup the concept of trustworthy AI: each pillar is necessary\\nbut not sufficient on its own to achieve trustworthy AI. 
Key\\nrequirements can contribute to one or several pillars, just\\nlike construction elements such as concrete, formwork or\\ncantilevers are used to help pillars support the structure\\nof the building. These requirements must be continuously\\nensured throughout the entire life cycle of AI systems,\\nthrough methodologies that must not only be technical, but\\nalso involve human interaction.\\nAccording to the EU Ethical Guidelines for Trustworthy\\nAI [3], pillars of trustworthy AI systems are defined as three\\nbasic properties that such systems should possess:\\n• Pillar 1: Lawful. Trustworthy AI systems should comply\\nwith applicable laws and regulations, both horizontally\\n(i.e. the European General Data Protection Regulation)\\nand vertically (namely, domain-specific rules that are\\nimposed in certain high-risk application domains, such as\\nmedical or finance).\\n• Pillar 2: Ethical. Besides their compliance with the law,\\ntrustworthy AI systems should also adhere to ethical prin-\\nciples and values. The rapid technological development\\nof current AI-based system rises ethical questions that\\nare not always addressed synchronously by regulatory\\nefforts. The democratized usage of large language models\\nand misinformation using deepfakes are two avant-garde\\nexponents of the relevance of Ethics as one of the pillars\\nof trustworthy AI.\\n• Pillar 3: Robust. Trustworthy AI systems should guarantee\\nthat they will not cause any unintentional harm, working\\nin a safe and reliable fashion from both technical (perfor-\\nmance, confidence) and social (usage, context) perspec-\\ntives.\\nTrustworthy AI stands on these three pillars. Ideally, they\\nshould act in harmony and pushing in synergistic directions\\ntowards the realization of trustworthy AI. However, tensions\\nmay arise between them: for instance, what is legal is not\\nalways ethical. Conversely, ethical issues may require the\\nimposition of law amendments that become in conflict with\\nprevalent regulations. Trustworthy AI must guarantee ethi-\\ncal principles and values, obey the laws, and operate robustly\\nso as to attain its expected impact on the socioeconomic\\nenvironment in which it is applied.\\nThe above three pillars lie at the heart of the HLEG\\nguidelines [3], which establish the seven requirements for\\ntrustworthy AI. As depicted in Figure 3, each requirement\\nspans several components or dimensions in which the re-\\nquirement becomes of special relevance for the design and\\noperation of an AI-based system. Apart from recommending\\ntechnical and non technical methods, the guidelines also\\ninclude an Assessment List for Trustworthy Artificial Intel-\\nligence, ALTAI, for self-assessment of AI systems and for\\nmaking the 7 requirements operative [19].\\nThe next subsections elaborate in detail on these require-\\nments:\\n5.2. Requirement 1: Human agency and oversight\\nWHAT does it mean? AI systems should empower human\\nbeings, allowing them to make informed decisions and fos-\\ntering their fundamental rights. At the same time, proper\\noversight mechanisms need to be ensured, which can be\\nachieved through human-in-the-loop, human-on-the-loop,\\nand human-in-command approaches. In other words, AI-\\nbased systems must support human autonomy and decision\\nmaking.\\nWHY is it important for trustworthiness? This require-\\nment is necessary for autonomy and control. 
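To make the oversight mechanisms named in the WHAT paragraph above (human-in-the-loop, human-on-the-loop, human-in-command) more concrete, the short sketch below shows one minimal way a deployed classifier could refuse to decide on its own and defer low-confidence cases to a human reviewer. This is only an illustration, not a mechanism prescribed by the HLEG guidelines: the 0.8 confidence threshold, the `HumanReviewQueue` class, the `decide_with_oversight` helper and the use of scikit-learn are all assumptions introduced here.

```python
# Minimal human-in-the-loop (HITL) deferral sketch (illustrative only).
# Assumptions: a model exposing predict_proba(), a confidence threshold of 0.8,
# and a simple in-memory review queue; none of these come from the HLEG guidelines.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class HumanReviewQueue:
    """Collects the inputs the system refuses to decide on automatically."""
    pending: List[Tuple[int, list]] = field(default_factory=list)

    def submit(self, case_id: int, features: list) -> None:
        self.pending.append((case_id, features))


def decide_with_oversight(model, case_id, features, queue, threshold=0.8):
    """Return the model's decision only when it is confident enough;
    otherwise defer the case to a human reviewer (human-in-the-loop)."""
    proba = model.predict_proba([features])[0]
    confidence = float(max(proba))
    if confidence < threshold:
        queue.submit(case_id, features)
        return None  # no automated decision; a human will decide
    return int(proba.argmax())


if __name__ == "__main__":
    # Toy demonstration, assuming scikit-learn is available.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=1000).fit(X, y)

    queue = HumanReviewQueue()
    for i, x in enumerate(X[:20]):
        decision = decide_with_oversight(model, i, list(x), queue)
        status = "auto" if decision is not None else "deferred to human"
        print(f"case {i}: {status}")
    print(f"{len(queue.pending)} cases await human review")
```

A human-on-the-loop or human-in-command variant would move the intervention point to the design and monitoring cycle, or give the supervisor the power to override any automated decision, as described in the degrees of oversight discussed next.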
The unfair\\nmanipulation, deception, herding and conditioning of AI-\\nbased systems can be a threat to the individual autonomy,\\nrights and freedom of their users. Therefore, trustworthy AI\\nsystems should provide the means for the user to supervise,\\nevaluate and freely adopt/override a decision made by such\\nsystems, avoiding decisions that are automatically made\\nwithout humans being involved in the process.\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 8 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\nEthics\\nTrustworthy \\nArtificial Intelligence\\nLawfulness\\nRobustness\\nAccountability\\nSocietal and \\nenvironmental \\nwellbeing\\nDiversity, \\nnon-discrimina-\\ntion & fairness\\nTransparency\\nPrivacy and data \\ngovernance\\nTechnical \\nrobustness \\nand safety\\nHuman agency \\nand oversight\\nRequirement 5\\nRequirement 4\\nRequirement 3\\nRequirement 2\\nRequirement 1\\nRequirement 7\\nRequirement 6\\nSubsection 5.8\\nSubsection 5.7\\nSubsection 5.6\\nSubsection 5.5\\nSubsection 5.4\\nSubsection 5.3\\nSubsection 5.2\\nFigure 3: Pillars and requirements of Trustworthy AI [3].\\nHOW can this requirement be met in practice? Two\\ndimensions underlie this first requirement, namely, human\\nagency (Subsection 5.2.1) and human oversight (Subsection\\n5.2.2). We now analyze different methodological approaches\\nthat can be adopted in these two dimensions:\\n5.2.1. Human agency\\nMechanisms for human oversight will depend on the\\narea of application and potential risk. For the preservation\\nof human rights, human-compatible [20], human-centric\\napproaches [21, 22, 14], AI for social good [23, 7] and\\nhuman computation or interactive machine learning [24]\\nare computing paradigms aligned with this requirement.\\nHowever, more structured toolkits (along the lines of [25]\\nor C-Suite [26]) will need to be materialized for a smooth\\ndomain-specific consideration of this requirement. In terms\\nof technical tools to reach different audiences, language\\nappears as the universal means of communication among\\nhumans and machines, and thus, AI models using natural\\nlanguage processing and/or counterfactual and natural lan-\\nguage explanations [27] will be relevant to help humans\\nsupervise and take the most appropriate decision based on\\nthe output of AI systems.\\n5.2.2. Human oversight\\nDifferent degrees of human involvement in the supervi-\\nsion of AI-based systems can be specified [3]:\\n• Human-in-the-loop (HITL), which refers to the ability of\\nthe supervisor to intervene in every decision cycle of the\\nsystem being monitored [24].\\n• Human-on-the-loop (HOTL), which stands for human\\nintervention during the design and monitoring cycles of\\nthe AI-based system.\\n• Human-in-command (HIC), namely, the capability of the\\nsupervisor to oversee the overall activity of the AI system\\nincluding its broader economic, societal, legal and ethical\\nimpacts, and ensuring that decisions produced by the AI\\nsystem can be overridden by the human.\\nDepending on the application under consideration, mech-\\nanisms supporting one of the above levels of human over-\\nsight can be designed. Methods proposed so far are largely\\ndomain-specific, since user-algorithm interfaces vary de-\\npending on the capabilities and background of the supervisor\\nand the design of the AI-based solution.\\n5.3. Requirement 2: Technical robustness and\\nsafety\\nWHAT does it mean? 
This second requirement includes\\nseveral functionalities all aligned with the prevention of un-\\nintentional harm and the minimization of the consequences\\nof intentional harm. These include the resilience of AI-based\\nsystems (to attacks and security), ensuring fallback plans\\n(in case something goes wrong), general safety, and being\\naccurate, reliable and reproducible. Robustness and safety\\nrefer to the need of AI systems to be secure, reliable and\\nrobust enough to errors and/or inconsistencies in all phases\\nof the life cycle [28].\\nWHY is it important for trustworthiness? AI-based sys-\\ntems deployed on real-world scenarios can undergo changes\\nin their operating environment that can induce changes at\\ntheir inputs (e.g. concept drift). Likewise, such changes can\\nbe the result of the interaction of malicious users with the\\nAI-based system in an adversarial fashion. Disregarding\\nwhether such changes are intentional or not, the trustworthi-\\nness of the AI-based system is subject to the capability of the\\nmodel to mitigate the impact of these changes in their issued\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 9 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\npredictions. Likewise, in risk-critical applications trustwor-\\nthy AI systems should evaluate relevant safety measures\\nand endowed with functionalities to fall back when the AI-\\nbased system deviates from its expected behavior as per the\\nmonitored measures. Finally, reliability and reproducibility\\nconnects tightly with trustworthiness in what refers to the\\nverification of the expected operation and performance of\\nAI-based systems. When AI-based systems are to be used\\nin different contexts and deployed in different systems, such\\ncomponents are vital to ensure that the system at hand\\nresiliently accommodates the differences and particularities\\nthat may arise in each context/system, ultimately working as\\nexpected.\\nHOW can this requirement be met in practice? Method-\\nologies that can be explored to support this requirement\\ncan be analyzed over three dimensions: technical robustness\\n(Subsection 5.3.1), safety (Subsection 5.3.2) and repro-\\nducibility (Subsection 5.3.3).\\n5.3.1. Technical robustness\\nWhen dealing with an AI-based system, robustness and\\nreliability are properties that refer to the ability of the sys-\\ntem to have comparable performance on atypical data with\\nrespect to typical operational regimes [29]. Robustness can\\nbe established in the face of different circumstances: when\\nwe hope a model to be robust, it is due to the fact that\\nthe model may degrade, be perturbed or affected during its\\nfuture usage. It is desirable to have a model that is robust in\\nterms of its generalization or generative capabilities, against\\nadversarial attacks or models, or against data perturbations.\\nSystematically, several levels of robustness can be dis-\\ntinguished in AI-based systems [30]:\\n• Level 0 (no robustness or standard training): this first\\nlevel of robustness refers to the one provided by the\\nAI-based system by itself, without any risk mitigation\\nfunctionalities or additions added to its design. This level\\nconcerns generalization capabilities such as being robust\\nto distributional changes caused by spurious features or\\ndata instances. Despite the lack of specific risk mitigation\\nmeasures, some naive information provided by certain\\nnaive AI models (e.g. 
• Level 1 (generalization under distribution shifts): this second level of robustness considers techniques aimed at mitigating different types of changes in data (a minimal detection sketch is given after this list). Data changes or shifts include covariate shift, prior probability shift, concept drift and confounding shift, depending on whether the change happens in the distribution of the input variables, the output of the model, the statistical relationship between inputs and outputs, or a variable that influences both inputs and outputs, respectively [31]. At this level we can also place the generalized framework of Out-of-Distribution (OoD) detection [32], which covers anomaly detection, novelty detection and open set recognition, the latter referring to the capability of the model to detect, characterize and incorporate new unknown patterns into its knowledge base (e.g. new classes in a classification problem). Level 1 robustness against these data shifts can be approached by concept drift detection and adaptation techniques, OoD detection methods or class-incremental learning schemes, to mention a few.
• Level 2 (robustness against a single risk): this third level addresses worst-case robustness against a single point of risk, e.g., the presence of adversarial examples. Assessing this level requires model inspection and intervention (e.g., active model scanning, probing to find failure cases, adversarial defenses against different attack modes).
• Level 3 (robustness against multiple risks): it extends the former to multiple risks (e.g., common data corruptions, spurious correlations).
• Level 4 (universal robustness): this level is reached by AI-based systems that are proven to be effectively robust to all known risks.
• Level 5 (human-aligned and augmented robustness): it further complements Level 4 by aligning with human-centered values and user feedback, automatically augmenting existing robustness demands as per the requirements, context and usage of the AI-based system. This level should be targeted by high-risk AI-powered applications.
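As a rough illustration of the Level 1 techniques referenced above, the sketch below monitors incoming data for distribution shift by comparing each feature of a production batch against a reference (training) sample with a two-sample Kolmogorov-Smirnov test; the reference data, the significance level and the alerting policy are illustrative assumptions, not a method prescribed by the text.

```python
# Minimal distribution-shift check (Level 1 robustness sketch): compare each
# feature of a new data batch against reference (training) data using a
# two-sample Kolmogorov-Smirnov test. Thresholds and alerting are illustrative.
import numpy as np
from scipy.stats import ks_2samp

def detect_feature_drift(reference: np.ndarray, batch: np.ndarray,
                         alpha: float = 0.01) -> list[int]:
    """Return the indices of features whose distribution appears to have shifted."""
    drifted = []
    for j in range(reference.shape[1]):
        statistic, p_value = ks_2samp(reference[:, j], batch[:, j])
        if p_value < alpha:  # reject "same distribution" at significance alpha
            drifted.append(j)
    return drifted

# Example usage with synthetic data: feature 1 of the batch is shifted.
rng = np.random.default_rng(0)
reference = rng.normal(size=(1000, 3))
batch = rng.normal(size=(200, 3))
batch[:, 1] += 1.5  # simulate covariate shift on one feature
print(detect_feature_drift(reference, batch))  # expected to flag feature 1
```

The same pattern extends to monitoring model outputs (e.g. prediction confidence) rather than raw inputs, which is closer to OoD detection.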
The robustness of the AI-based system should be a core part of the risk management strategy adopted by the owner of the system itself, hence becoming a core part of their accountability procedures. Indeed, AI maintenance frameworks should ease achieving robustness and AI status tracking and control throughout the AI life cycle [33]. Monitoring can be carried out either passively (by periodically measuring different quantitative metrics related to robustness over the data, the model, or both) or actively (by emulating the circumstances under which the robustness of the model can be thought to be compromised, e.g. emulated adversarial attack instances or perturbations of known samples). In both cases, AI maintenance frameworks can detect model degradation over time by detecting systematic deviations of the aforementioned metrics in data and models [34]. Interestingly, areas currently under study in AI research aim in this direction, endowing AI-based systems with the ability to learn continually from infinite streams of varying data [35], to quantify and communicate their confidence in their outputs [36], or to characterize and consolidate new patterns arising from data over time [37].

We end the discussion about how technical robustness can be supported in AI-based systems by highlighting the potential that techniques used to address other requirements can bring to technical robustness. For instance, explainability techniques can help make models more robust, since they can show which features are more stable to out-of-distribution changes in the input or to adversarial attacks. Likewise, the intensity of changes needed to reach a target adversarial confidence score in counterfactual generation can be a reliable estimator of the extent to which a certain data instance can be considered out of distribution [38]. All in all, these examples are a few among the multiple cases in which a functionality added to an AI-based system can simultaneously contribute to several requirements for trustworthiness.

5.3.2. Safety
Evolving from a generic Information Technologies context, safety in AI [39, 40, 41] is developing in relation to the alignment with human values. In this sense, concrete protocols and procedures are challenging to define, but necessary for AI safety. Safety in AI concerns several unsolved research issues [40], including:
• Attaining robustness as the objective of withstanding hazards, and building systems that are less vulnerable to adversarial threats such as adversarial perturbations which cause high-confidence mistakes, and that are robust to long tails.
• Facilitating tools to inspect AI-based systems, identify hazards and anomalies, calibrate them, identify honest outputs, and detect emergent capabilities. One risk of AI systems that links with the need for safety tools is that they may carry backdoors [42]: backdoored models behave correctly in nearly all scenarios, except in chosen scenarios in which they have been taught to behave incorrectly through training on poisoned data, which is a way of injecting backdoors. These are problematic, especially in foundational models that serve as the architectural backbone of downstream models, all evolved from originally poisoned data from massive training datasets [40].
• Defining safety objectives in order to steer models, either internally (how models should learn to guarantee compliance with safety metrics) or externally (how such safety compliance can be safely pursued). Problems in this regard include:
– Value learning, as the inability of AI systems to encode human values (e.g., happiness, sustainability, meaningful experiences or safe outcomes).
Although giving open-world inputs to models can partially tell apart pleasant and unpleasant states, the utility values of such states are not ground-truth values, but rather the result of the model's own learned utility function [43].
– Proxy gaming: this is a phenomenon due to the fact that optimizers and adversaries can manipulate objective proxies. As Goodhart's law states, a measure ceases to be a reliable indicator when it becomes the target. For instance, proxy gaming occurs as reward hacking in reinforcement learning. Similarly, objective countable metrics end up substituting human values when opaque AI models are forced to learn by optimizing a single quantitative measure13. Therefore, merely acquiring a proxy for human values is insufficient: models must also be resilient to solvers seeking to manipulate it.
13 These are also known as weapons of math destruction [44], which may contain pernicious feedback loops that perpetuate stereotypes and biases [45] if they consider neither context nor a concrete person's features, but rather those of its neighbors.

5.3.3. Reproducibility
Once robustness and safety have been addressed, an important dimension of this key requirement for trustworthy AI is reproducibility. It can be defined as the ability of AI experiments to exhibit the same behavior when repeated under the same conditions. Reproducibility is related to replicability, which refers to the capability to independently achieve non-identical conclusions that are at least similar, while differences in sampling, research procedures and data analyses may exist [14]. Since both concepts are essential parts of the scientific method, the National Information Standards Organization (NISO) and the Association for Computing Machinery (ACM) redefine these concepts as:
• Repeatability (same team, same experimental setup), which means that an individual or a team of individuals can reliably repeat their own experiment.
• Replicability (different team, same experimental setup): an independent group of individuals can obtain the same result using artifacts which they independently develop in their entirety.
• Reproducibility (different team, different experimental setup with stated precision): a different independent group can obtain the same result using their own artifacts.
It should be clear that, when formulated in the context of trustworthy AI systems, an experiment in the above definitions should be regarded as the performance, robustness and safety evaluation of a given AI-based system. This evaluation can be done by different groups (as in research) or by a certification lab (as in commercial software-based solutions). The extent to which reproducibility can be guaranteed in trustworthy AI systems depends on the confidentiality of the system or the singularity of the experimental setup for which the AI-based system was developed, among other constraining circumstances. For instance, in mild contexts (as in research), the reproducibility of experiments by third parties is often favored by public releases of the source code implementing the AI-based system being proposed.
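In practice, repeatability starts with controlling sources of nondeterminism and recording the experimental environment. The sketch below is a minimal illustration rather than a complete protocol: it fixes the random seeds of common Python libraries and captures package versions alongside the results; the chosen libraries and the JSON report format are assumptions.

```python
# Minimal repeatability sketch: fix random seeds and record the software
# environment together with the experiment's results. The chosen libraries
# and the JSON report format are illustrative assumptions.
import json
import platform
import random
import sys

import numpy as np

def set_seeds(seed: int = 42) -> None:
    """Seed the pseudo-random generators used in the experiment."""
    random.seed(seed)
    np.random.seed(seed)

def environment_report() -> dict:
    """Capture the execution environment so others can reproduce the setup."""
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "numpy": np.__version__,
    }

set_seeds(42)
result = {"accuracy": float(np.mean(np.random.rand(10) > 0.5))}  # stand-in experiment
print(json.dumps({"environment": environment_report(), "result": result}, indent=2))
```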
5.4. Requirement 3: Privacy and data governance
WHAT does it mean? This requirement ensures respect for privacy and data protection throughout the AI system life cycle (design, training, testing, deployment and operation), together with adequate data governance mechanisms that take into account the quality and integrity of the data and its relevance to the domain; it also ensures legitimate access to data and processing protocols.

WHY is it important for trustworthiness? AI systems based on digital records of human behavior can be capable of inferring individual preferences and revealing sensitive personal information such as sexual orientation, age, gender, or religious or political views. Since AI-based systems learn from data, they must guarantee that such personal information is not revealed while data is processed, stored and retrieved throughout the AI life cycle, facilitating means to trace how data is used (governance) and to verify that protected information is not accessed (privacy awareness) during the life cycle phases. If such guarantees are not provided, AI-based systems will not be trusted by end users, nor will they conform to existing legislation (e.g. the European GDPR). Citizens should have full control over their data, and their data must not be unlawfully or unfairly used to harm or discriminate against them [28]. This requirement is important to preserve human rights such as the right to privacy, intimacy, dignity or the right to be forgotten. Keeping the usage and scope of the data limited, protected and informed is paramount, since digital information can be used to cluster a person into profiles that may not reflect reality, while reinforcing stereotypes, historical differences among minorities, or perpetuating historical or cultural biases [44].

HOW can this requirement be met in practice? In the following we analyze technologies that can maintain data privacy in AI-based systems (Subsection 5.4.1), and strategies to deal with data governance as quality and integrity processes (Subsection 5.4.2).

5.4.1. Data privacy
In order to translate the data privacy requirement into actual technologies, we emphasize the relevance of Federated Learning (FL) [46, 47], homomorphic computing [48] and Differential Privacy (DP) [49] as examples of privacy-aware technologies in the current AI landscape:
• In FL, a model is trained across multiple decentralized devices without moving the data to a central location. Instead of delivering all the data to a central server, devices learn models locally using their own data, so that only numerical model updates are sent to the central server. The central server aggregates the updated model parameters from all the devices or servers to create a new model. This allows learning a global model that leverages all the data in situations where the data is sensitive. Besides preserving the privacy of local data, FL can reduce communication costs and accelerate model training.
• In homomorphic computing, data can be processed in encrypted form without the need to decipher it first. As a result, data remains secure and private, since operations are performed directly on encrypted data. By using specially devised mathematical operations, the underlying structure of the data is preserved while it is processed, so that the (also encrypted) result of the computation, once decrypted, matches the result that would have been obtained on the original data. Only authorized parties holding the decryption key can access this information. Homomorphic computing can be an effective way to implement privacy-aware preprocessing, training and inference in AI-based systems.
• Finally, DP enables processing and learning from data while minimizing the risk of identifying individuals in the dataset at hand (a minimal sketch is given after this list). To this end, DP injects random noise into the data before it is processed. This noise is calibrated to guarantee that the data remains statistically accurate, while concealing any information that could be used to identify individuals and thereby compromise their privacy. The amount of noise added to the data balances the level of privacy protection provided by DP against the performance degradation of the AI-based system with respect to the case in which no noise is injected.
By resorting to any of the above technologies (or combinations of them), the privacy of individuals in the datasets is preserved, minimizing their risk of harm.
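As a rough illustration of the DP idea above, the sketch below releases a differentially private mean of a bounded numeric attribute using the Laplace mechanism; the bounds, the privacy budget epsilon and the use of a simple mean query are illustrative assumptions rather than a technique taken from the text.

```python
# Minimal differential-privacy sketch (Laplace mechanism): release a noisy mean
# of a bounded attribute. Bounds, epsilon and the query are illustrative choices.
import numpy as np

def dp_mean(values: np.ndarray, lower: float, upper: float, epsilon: float) -> float:
    """Return an epsilon-DP estimate of the mean of values known to lie in [lower, upper]."""
    clipped = np.clip(values, lower, upper)
    n = len(clipped)
    # Sensitivity of the mean of n values bounded in [lower, upper] is (upper - lower) / n.
    sensitivity = (upper - lower) / n
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return float(clipped.mean() + noise)

ages = np.random.randint(18, 90, size=1000)
print(dp_mean(ages, lower=18, upper=90, epsilon=0.5))  # noisy, privacy-preserving mean
```

Smaller epsilon values add more noise, trading accuracy for stronger privacy protection, which is exactly the balance described in the bullet above.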
5.4.2. Data governance: Quality and integrity of data and access to data
Data protocols must govern data integrity and access for all individuals, even if these are not users of the AI system. Only duly qualified staff, with an explicit need and competence, should be allowed to access individuals' data. As a part of AI governance, data governance calls for regulation at a broader level than that of a single country or continent. This context has motivated guidelines and recommendations for AI governance over the years, emphasizing the importance of ensuring data quality, integrity and access. An example can be found in the Universal Guidelines for AI published in 2018 [50], which were endorsed by 70 organizations and 300 experts across 40 countries. In these guidelines, the Data Quality Obligation was established as one of the principles that should be incorporated into ethical standards, adopted in regulations and international agreements, and embedded into the design of AI-based systems. These recommendations helped inform the OECD AI Principles (2019), the UNESCO Recommendation on AI Ethics (2021), the OSTP AI Bill of Rights (2022), the EU AI Act and the Council of Europe Convention on AI.

In terms of guidelines to implement data governance, the Information Commissioner's Office (ICO) has proposed recommendations on how to use AI and personal data appropriately and lawfully [51]. Among these are actions such as taking a risk-based approach when developing and deploying AI: "addressing risk of bias and discrimination at an early stage", "ensuring that human reviews of decisions made by AI are meaningful", "collecting only the data needed and no more", and "working with external suppliers to ensure the use of AI will be appropriate".

At the European level, the European Strategy for Data established in 2020 aims at making the EU a role model for a society empowered by data. This strategy has given rise to the European Data Governance Act [52] to facilitate data sharing across sectors and Member States. In particular, the EU Data Governance Act intends to make public sector data available for re-use, promote data sharing among businesses, allow the use of personal data through a "personal data-sharing intermediary", help citizens exercise their rights under the General Data Protection Regulation (GDPR), and allow data use on altruistic grounds [52].

Later, in 2022, the European Union strategy for data proposed the Data Act [53]14, a regulation harmonizing rules on fair access to and use of data. This regulation complements the Data Governance Act by specifying who can create value from data and under which circumstances. In practice, the Data Act will take action to 1) increase legal certainty for companies and consumers who generate data, on who can use what data and under which conditions; 2) prevent abuse of contractual imbalances that hinder fair data sharing; 3) provide means for the public sector to access data of interest held by the private sector; and 4) set the framework conditions for customers. Therefore, the benefits of the Data Act for consumers and businesses range from cheaper aftermarket prices for connected objects to new opportunities to use services based on data access and better access to data produced by devices. Serving these two EU regulations, ten European common data spaces exist, ranging from industry to mobility.

5.5. Requirement 4: Transparency
WHAT does it mean? Transparency is the property that ensures that appropriate information reaches the relevant stakeholders [29]. When it comes to AI-based systems, different levels of transparency can be distinguished [54]: simulatability (of the model by a human), decomposability (the ability to explain the model behavior and its parts), and algorithmic transparency (understanding the process of the model and how it will act for any output). Another classification establishes transparency at the algorithmic, interaction and social levels [55], emphasizing the role of the stakeholder audience to which the explanation is targeted: developer, designer, owner, user, regulator or society.

WHY is it important for trustworthiness? In the context of trustworthy AI systems, the data, the system itself and AI business models should be transparent. Humans must be informed of a system's capabilities and limitations and always be aware that they are interacting with an AI system [3]. Therefore, explanations should be timely, adapted and communicated to the stakeholder audience concerned (layperson, regulator, researcher or other stakeholder), and the traceability of AI systems should be ensured.

HOW can this requirement be met in practice? The dimensions to be treated within this requirement concern traceability, explainability and communication, which are essential for realizing transparent AI-based systems. In the following we first explain what traceability stands for (Subsection 5.5.1), then the current state of the art on explainable AI (Subsection 5.5.2), and finally mechanisms for communicating AI systems' decisions (Subsection 5.5.3).
14 Data Act & Data Act Factsheet, https://digital-strategy.ec.europa.eu/en/policies/data-act, accessed on April 25th, 2023.
5.5.1. Traceability
Traceability is defined as the set of mechanisms and procedures aimed at keeping track of the system's data, development and deployment processes, normally through documented recorded identification [14]. Traceability and logging from the early design stages of AI-based systems can help in auditing and in achieving the required level of transparency according to the needs of the concerned audience.

In this regard, special attention must be paid to provenance tools [56], which ease the traceability or lineage of data and model decisions, hence contributing to the requirement of transparency. In this area, the use of Blockchain mechanisms is promising for guaranteeing the integrity of the data used to train (and explain) machine learning models, i.e., the provenance of databases and their associated quality, bias and fairness.

5.5.2. Explainability
The so-called eXplainable AI (XAI) field [54] is widely recognized as a crucial feature for the practical deployment of trustworthy AI models. The existing literature and contributions already made in this field include broad insights into what is yet to be achieved [54, 57, 58]. Efforts have been invested in defining explainability in machine learning, extending previous conceptual propositions and requirements for responsible AI by focusing on the role of the particular audience for which explanations are to be generated [54]: given an audience, an explainable AI is one that produces details or reasons to make its functioning clear or easy to understand.

Explainability techniques are blooming as tools to support algorithmic auditing. They have emerged as a necessary step to validate and understand the knowledge captured by black-box models, i.e., systems in which only inputs and outputs are observed without knowing the internal details of how they work. This can be problematic, as we cannot predict how the system may behave in unexpected situations (connecting with the technical robustness requirement, Subsection 5.3), or how it can be corrected if something goes wrong (linked to the accountability requirement, Subsection 5.8). Explaining which input factors contribute to the decisions of complex black-box algorithms can provide a useful global view of how the model works, jointly with traceability methods and a clear and adapted communication of information to the target audience.

Since the quality of explanations depends on the audience and the motivation for which they are generated, several taxonomies of XAI techniques have been proposed over the years [54]. A primary distinction can be drawn between model-agnostic and model-specific approaches to explaining machine learning models, the difference being whether the XAI technique can be applied to any machine learning model regardless of its structure and learning algorithm. Another distinction can be made between ex-ante and post-hoc XAI techniques, depending on the moment at which explainability is addressed (before or after the model is designed and trained).
On the one hand, ex-ante techniques (also referred to as the explainable-by-design paradigm) make AI models aspire to provide an explanation without constructing additional models or adding extra complexity (layers or mechanisms that are not part of the original model), so that explanations are as faithful as possible to the real reasoning carried out by the model. On the other hand, post-hoc XAI techniques usually add artifacts around the original AI model or build a surrogate of it – a local approximation or simpler version of the original one – in order to more easily explain the original model (for example, LIME [59]). Likewise, some XAI techniques may use external knowledge (e.g. from the web, Wikipedia or forums) [60], for instance to explain language models or dialogue models that interactively answer questions about a model's particular decision.

Other criteria to categorize XAI tools can be formulated, such as the format of the issued explanations (e.g., attribution methods [61], counterfactual studies [62], simplified model surrogates [63]) or the hybridization of explanations expressed in different modalities, such as visual and textual (e.g., linguistic summaries [64], ontologies [65], or logical rules defined on top of knowledge graphs [66], to cite a few). Natural language explanations [67, 27], quantitative measures of the quality of explanations [68, 69], and models that support their learning process with a formal symbolic basis such as language, rules, compositional relationships or knowledge graphs (neural-symbolic learning and reasoning [66]) are key for explanations to be understood by non-expert audiences. These interfaces allow such users to assess the operation of the model in a more intelligible fashion, hence supporting the human agency and oversight requirement for trustworthy AI systems (Subsection 5.2).
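As a small illustration of the post-hoc, model-agnostic family discussed above, the sketch below estimates global feature importance by permutation: it measures how much a trained model's accuracy drops when each feature is shuffled. The scikit-learn model, dataset and metric are illustrative assumptions; techniques such as LIME or SHAP would be applied analogously through their own libraries.

```python
# Post-hoc, model-agnostic explanation sketch: permutation feature importance.
# The dataset, model and metric are illustrative; the same idea applies to any
# black-box predictor exposing only inputs and outputs.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
baseline = model.score(X_test, y_test)

rng = np.random.default_rng(0)
importances = []
for j in range(X_test.shape[1]):
    X_perm = X_test.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the feature-target link
    importances.append(baseline - model.score(X_perm, y_test))

top = np.argsort(importances)[::-1][:5]
print("Most influential features (by accuracy drop):", top, np.array(importances)[top])
```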
5.5.3. Communication
The third dimension of transparency is how the audience is informed about the AI-based system, namely, how explanations or information tracked about the system's operation are communicated to the user. Humans should know when they are interacting with AI systems, as well as be notified about their performance, instructed about their capabilities, and warned about their limitations. The same holds when conveying the explanation of the model's output and its functioning to the user. The explanation must be adapted to the specifics of the AI system being explained and to the cognitive capabilities (knowledge, background expertise) of the audience.

Therefore, communication is a crucial dimension, so that all aspects related to transparency are delivered to the audience in a form and format adapted to their background and knowledge. This is key to attaining the audience's trust in the AI-based system at hand.

5.6. Requirement 5: Diversity, non-discrimination and fairness
WHAT does it mean? This requirement contemplates different dimensions: the avoidance of unfair bias, the fostering of diversity, accessibility to all regardless of any disability, and the involvement of stakeholders in the entire AI system life cycle. All the dimensions of this manifold requirement share an ultimate purpose: to ensure that AI-based systems do not deceive humans nor limit their freedom of choice without reason. Therefore, it is a requirement tightly linked to the ethical and fairness principles that underlie trustworthiness in AI (Section 2).

WHY is it important for trustworthiness? This requirement is necessary to broaden the impact of AI to all social substrates, as well as to minimize the negative effects that automated decisions may have in practice when models are learned from data that contains hidden biases. Unfair bias in data must be avoided, as decisions drawn by a model learned from such data could have multiple negative implications, including the marginalization of vulnerable groups and the exacerbation of prejudice or discrimination [3].

HOW can this requirement be met in practice? Methods to tackle this requirement can be classified depending on the specific dimension they support: methods to enforce diversity, non-discrimination, accessibility, universal design and stakeholder participation are briefly revisited in Subsection 5.6.1, whereas Subsection 5.6.2 describes how to achieve fairness in AI-based systems. Finally, Subsection 5.6.3 examines mechanisms to avoid unfair bias.

5.6.1. Diversity, non-discrimination, accessibility, universal design and stakeholder participation
AI systems should take into account all human abilities, skills and requirements, and ensure accessibility to them. Developing methodologies based on the requirements of non-discrimination and bias mitigation is paramount to ensure the alignment of AI systems with ethical values. Assuring non-discrimination, fairness and bias mitigation restricts systematic differences in the treatment of certain groups (of people or objects) with respect to others [29]. A practical recommendation in this direction is, for example, to hire from diverse backgrounds, cultures and disciplines so as to ensure diversity of opinions.

This requirement involves the inclusion of diverse data and people, and ensures that individuals at risk of exclusion have equal access to AI benefits. Concrete implementations of this requirement range from quantifying the impact of demographic imbalance [70] to assessing the effects of missing data (which, as a matter of fact, has been shown to be beneficial in terms of fairness metrics [71]).

In what refers to diversity, this requirement advocates the need for heterogeneous and random sampling procedures for data acquisition, a diverse representation of the population that includes minorities, and the assurance that automated processes do not lead to unfairness or biased models. Diversity can be enforced at the technical level during model training by penalizing the lack of diverse prototypes in latent space areas with challenging separation between classes [72]. Alternatively, the lack of diversity can be counteracted by means of imbalanced learning or by informing data augmentation.
When placing the focus on the solutions of an AI-based system, their diversity is a very relevant component to guarantee non-biased outcomes. Frameworks unifying quality and diversity optimization can guarantee the diversity of generated solutions, which may later serve, for instance, in robotics to learn behaviorally diverse policies [73].

From a global perspective, the so-called discrimination-conscious by-design paradigm collectively refers to methodologies where discrimination detection and prevention are considered from the very beginning of the design of the AI-based system through fairness [74]. Methods adopting this paradigm include discrimination-aware data mining [75], compositional fairness, the interpretation of sanity checks and ablation studies.

In summary, diversity must be enforced both in the data from which models are learned and among the stakeholders, i.e., fostering the inclusion of minorities (practitioners and users) [7, 22] or using methodologies such as participatory design for accessibility [76]. Universal Design principles, which consider accessibility and "design for all" [3] during development, governance, policy and decision-making processes, are one way to facilitate AI life cycles that take into account what is beneficial for everyone, accounting for different conditions and situations, without discrimination.

To further enable universal design and stakeholder participation, feedback is often sought – even after deployment – for stakeholder participation and consultation. One way to achieve this is through active learning for machine learning systems. Active learning allows integrating users' feedback while models are learned, and enables interactivity with the user, one of the goals targeted by human-centered AI [77] and AI for social good [23].

5.6.2. Fairness
The second dimension of this requirement is fairness, namely, techniques aimed at reducing the presence of unfair outputs elicited by AI-based systems. An unfair algorithm can be defined as one producing decisions that favor a particular group of people. Following the comprehensive view on this topic published in [78], the biases leading to such unfair decisions can be propagated from the data to the AI algorithm (including measurement, omitted variable, sampling or representation biases, among others); from the algorithm to the user (as in algorithmic, popularity or evaluation biases); or from the user to the data (respectively, biases induced in the production of content, and temporal, historical and/or social biases).

Fairness guarantees in the decisions of AI-based systems have been approached extensively in the literature, reporting bias-targeting methods that can be classified into three large groups:
• Pre-processing methods, where the available data are transformed so that the source of bias is reduced and, at best, removed.
• In-processing methods, which modify the learning algorithm of the model at hand (e.g. by changing the objective function or imposing constraints on the optimization problem) so as to minimize the effect of biases in the training process.
• Post-processing methods, which operate on the output of the model (for instance, by reassigning the predicted class for a query instance) without modifying its learning algorithm or the training data from which it was learned.
In general, it is widely acknowledged that fairness can be achieved by sacrificing accuracy to a certain extent [79]. However, it is also possible to debias machine learning models from the influence of spurious features and thereby even improve their performance [80]. Another trade-off is between fairness and privacy. Here, adversarial learning [81] can simultaneously learn a predictor and an adversary that models a protected variable; by minimizing the adversary's capacity to predict this protected variable, accurate predictions can show less stereotyping of the protected variable, almost achieving equality of odds as a fairness notion.

An important concept to be acquainted with when dealing with fairness in AI-based systems is fairwashing: as a risk of rationalization, fairwashing is the promotion of a false perception that a machine learning model respects ethical values through an outcome explanation and a fairness metric [82]. This makes it critical to characterize the manipulability of fairwashing [83], as well as LaundryML approaches [82] to better audit unfair opaque models.

5.6.3. Avoidance of unfair bias
Data and models can be exposed to a large set of potential bias-inducing phenomena. Ensuring diversity, representativeness and completeness in data and models needs to be a core part of the full AI life cycle (the design, development and deployment phases of AI-based systems). Bias can be uncovered through proxy discrimination by models, since proxy variables are likely to be picked up, surfacing features as proxies that otherwise would not have been considered, e.g., zip codes in predictive policing [44]. As shown in the previous dimension, bias is not only algorithmic, but extends beyond the limits of models in a vicious cycle: human activity bias and data bias lead to sampling bias on the web (especially to be considered when data are used to learn generative models), algorithmic bias, interaction bias and, finally, self-selection bias that can feed a second-order bias back into the algorithm [84].

Bias mitigation techniques include several approaches [45, 85], from generic requirements and toolboxes [86] to concrete taxonomies of bias [87, 88] at different stages of the AI life cycle [89]. Different notions of fairness can also be defined [78, 90], including causal fairness – which relies on causal relations and requires establishing causal graphs – or counterfactual fairness. Causality can help in debugging algorithmic bias mitigation or in explaining models [91]; e.g., causal mediation analysis can help uncover the disparate impact of models by estimating the fairness associated with different explaining variables [92].
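To make the above notions of fairness somewhat more tangible, the sketch below computes two common group-fairness diagnostics, the demographic parity difference and the equal opportunity difference, from a model's predictions and a binary protected attribute; the variable names, the 0/1 group encoding and the toy data are illustrative assumptions, and dedicated fairness toolkits would normally be used in practice.

```python
# Minimal group-fairness diagnostics sketch: demographic parity difference and
# equal opportunity difference for a binary protected attribute. Variable names
# and the 0/1 encoding of groups are illustrative assumptions.
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Difference in positive-prediction rates between the two groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return abs(y_pred[group == 1].mean() - y_pred[group == 0].mean())

def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true-positive rates between the two groups."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return abs(tpr(1) - tpr(0))

# Toy example: predictions for 8 individuals, protected attribute in {0, 1}.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_difference(y_pred, group))
print(equal_opportunity_difference(y_true, y_pred, group))
```

Pre-, in- and post-processing mitigation methods can then be compared by how much they reduce such gaps and at what cost in accuracy.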
5.7. Requirement 6: Societal and environmental wellbeing
WHAT does it mean? AI-based systems should benefit all humankind, not only at the present time but also for future generations. AI-based systems must therefore be sustainable and environmentally friendly, so that the technological adoption of AI does not entail a progressive depletion of natural resources and maintains an ecological balance [29]. Accordingly, the dimensions supporting this requirement include sustainability and environmental friendliness, as well as a careful assessment of the social and societal impacts of AI.

WHY is it important for trustworthiness? AI systems should increase positive social change and enhance sustainability and ecological responsibility. Although they can be an effective tool to mitigate climate change [93, 94], the greenhouse gases emitted by the computationally intensive training processes of complex AI-based systems can exacerbate existing social and ethical challenges linked to AI [95]. For instance, training one single AI model can emit as much CO2 as five cars over their entire lifetimes. Computational and environmental costs grow proportionally to the complexity of the model in terms of its number of parameters [96]. In particular, this study was done for large language models [97], which cost about 8.4 tons of CO2 per year, whereas the average yearly carbon footprint of a person is around 4 tons. Although emissions are amortized over the model's lifetime, the recent ChatGPT model was estimated to consume 1,287 MWh, which translates into 522 tCO2e [98]. Therefore, energy and policy considerations are to be taken into account by institutions and companies implementing AI [97].

HOW can this requirement be met in practice? This requirement is currently approached from two different angles: sustainability and environmental wellbeing (Subsection 5.7.1) and societal wellbeing (Subsection 5.7.2).

5.7.1. Sustainability and environmental wellbeing
Sustainable AI [99] takes a holistic perspective that spans models, data, algorithms and hardware, and considers how software-hardware co-design can help mitigate the carbon footprint of AI model life cycles (design, training and deployment stages). As mentioned previously, sustainable AI finds its motivation in the costly energy consumption of large AI models. Thus, sharing key lessons learned, best design practices, metrics and standards is key for a sustainable development of AI systems. Technical contributions aimed at implementing this requirement for the sustainability of AI are at the core of the Green AI research area [100], which studies efficient and ecologically aware designs of AI-based algorithms, systems and assets.

Many strategies have been proposed over the years to reduce the environmental impact of AI models, with emphasis on those characterized by a large number of parameters and long training latencies (e.g., deep neural networks). Among others:
• Assessment of the environmental impact of AI-based systems, e.g., with carbon footprint calculators15 [101]. Evaluating the factors that influence AI's greenhouse gas emissions is the first step towards mitigating its negative effects [95] (a rough estimation sketch is given after this list).
• Selection of the most relevant and necessary data, i.e., with smart data approaches [102].
• Model compression [103, 104], e.g. using quantization [105], distillation [106, 107] or acceleration [108] techniques.
• Consideration of efficiency as an evaluation metric and as a price tag, to make models greener and more inclusive for researchers with limited resources [100].
• Use of models that can rapidly adapt to new situations, domains and similar tasks by virtue of learning functionalities specifically devoted to this adaptation (e.g., multitask learning, few-shot learning, AutoML, meta-learning, neural architecture search or open-ended learning). This family of GPAIS can provide more efficient, sustainable and less data-dependent AI systems.
• Deployment of models on cloud computing servers fed by renewable energy sources, to minimize CO2 emissions.
15 Greenhouse Gases Equivalencies Calculator, https://www.epa.gov/energy/greenhouse-gases-equivalencies-calculator-calculations-and-references, accessed on April 25th, 2023.
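As a back-of-the-envelope illustration of the first strategy in the list above, the sketch below estimates training emissions from measured device power draw, training time and a regional grid carbon intensity; all numeric values (power draw, PUE, carbon intensity) are illustrative assumptions, and dedicated carbon-footprint calculators or trackers would be used for real reporting.

```python
# Rough training-emissions estimate: energy (kWh) times grid carbon intensity.
# Power draw, PUE and carbon intensity below are illustrative placeholder values.
def training_co2_kg(avg_power_watts: float, hours: float, n_devices: int,
                    pue: float = 1.5, grid_kgco2_per_kwh: float = 0.4) -> float:
    """Estimate kg of CO2e emitted by a training run."""
    energy_kwh = avg_power_watts * n_devices * hours / 1000.0
    return energy_kwh * pue * grid_kgco2_per_kwh

# Example: 8 GPUs at ~300 W for 72 hours on a grid emitting 0.4 kgCO2e/kWh.
print(f"{training_co2_kg(300, 72, 8):.1f} kg CO2e")
```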
5.7.2. Societal wellbeing
At the societal level, AI can improve social welfare. AI-based systems can perform routine tasks in an autonomous, safer and more efficient fashion, enhancing productivity and improving the quality of life of humankind. In public administration, AI can speed up processes, smooth administrative bottlenecks and save paperwork. Furthermore, it can aid policy making and help city planners, e.g., by visualizing the consequences of climate change, predicting future floods, or identifying urban heat islands. The possibilities for society at large to benefit from AI developments have exploded in recent years with the progressive digitization of almost all sectors of activity. Infrastructure planning, health and hunger, equality and inclusion, education, economic empowerment, security and justice are among the sectors where AI can unleash its full potential to foster use cases of societal impact.

Bringing such benefits of AI into practice is, therefore, a matter of leveraging such amounts of available data in AI-based systems that address learning tasks which solve a problem of societal impact, such as the ones exemplified above. However, since decisions issued by the AI-based system affect human beings and are subject to social scrutiny, other requirements of trustworthy AI become of utmost relevance, including fairness, privacy, transparency or human oversight. Above all, the importance of AI ethics and regulation becomes paramount in societal wellbeing, since decisions issued in use cases arising in education, justice and security have to comply with fundamental human rights and the legal restrictions in force.

5.8. Requirement 7: Accountability
WHAT does it mean? This last requirement of trustworthy AI systems imposes the provision of mechanisms to ensure responsibility and accountability for the development, deployment, maintenance and/or use of AI systems and their outcomes. Auditability, which enables the assessment of algorithms, data and design processes, plays a key role in accountability, namely, the attribution of the results of the actions that were taken based on the outcome of the AI-based system. Accountability, therefore, implies the minimization of harm and the reporting of negative impacts, the communication of design trade-offs to the user, and the implementation of adequate and accessible redress strategies associated with AI-based systems. Hence, auditability and accountability are closely related to each other and lie at the core of responsible AI systems, which are discussed later in Section 6.1.

WHY is it important for trustworthiness? The required auditability property of trustworthy AI systems demands the development of practical tools [109] that are capable of verifying desirable properties of neural networks such as stability, sensitivity, relevance or reachability [110], as well as metrics beyond explainability [111, 112, 113, 114, 115], such as traceability, data quality and integrity. Auditability is becoming increasingly important as standards touching upon all AI requirements are materialized. This includes IEEE, ISO/IEC and CEN/CENELEC, which are devising concrete guidelines to apply trustworthy AI requirements in industrial setups (see [29, 11] for an overview). At the national level, the German standardization roadmap on AI within DIN/DKE [11] is a clear example of the standardization efforts made by different governments to dictate how practical AI-based systems should be audited.

On the other hand, accountability is a key requirement for enabling recourse [116] when an AI model contributes to making a proven wrong decision, issuing explanations and recommendations for cases that are unfavorably treated by such a decision. Accountability is a matter of compliance with ethical and legal standards, answerability, reporting and oversight, and the attribution and enforcement of consequences [117]. Therefore, when framed under AI regulatory standards and ethical principles like the ones discussed in this work, accountability becomes crucial for AI-based systems to distribute costs, risks, burdens and liabilities among the different stakeholders participating in their life cycle.

HOW can this requirement be met in practice? Similarly to other requirements, we next analyze how the different dimensions spanned by this requirement can be tackled in practice. Subsection 5.8.1 deals with accountability, whereas Subsection 5.8.2 addresses auditability. The minimization and reporting of negative impacts is discussed in Subsection 5.8.3. Finally, Subsection 5.8.4 describes methods for algorithmic redress.

5.8.1. Accountability
Mechanisms of accountability are especially relevant in high-risk scenarios, as they assign responsibility for decisions in the design, development and deployment phases of the AI system. Tools to attain this requirement include algorithmic accountability policy toolkits (e.g., [118]), the post-hoc analysis of the output of the model (e.g. via local relevance attribution methods) and algorithms for causal inference and reasoning [119]. Since accountability is linked to the principle of fairness, it is closely related to risk management, since unfair adverse effects can occur. Risks must therefore be identified and mitigated transparently so that they can be explained to and verified by third parties, and techniques and tools for auditing data, algorithms and design processes are required for accountable decisions issued by AI-based systems. An overview of 16 risk assessment frameworks is available in [120], whereas built-in derisking processes for the design and development phases can be found in [86, 121]. These processes operationalize risk management in machine learning pipelines, including explainability and bias mitigation. Another set of resources to tackle bias and fairness is discussed in [86].

Emerging trade-offs between requirements should be stated and assessed with regard to the risk they pose to ethical requirements and the compromise of fundamental rights, since no AI system should be used when no risk-free trade-off for these can be found [3]. Consequently, AI models useful for accountability often involve multi-criteria decision making and pipelines at the MLOps level that help delineate and inform such trade-offs to the user.
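One simple operational building block for the accountability mechanisms above is a decision log that records, for every prediction, the model version, the inputs, the output and the person or process accountable for acting on it, so that decisions can later be traced, audited and attributed. The record fields and the append-only JSON-lines file used below are illustrative assumptions rather than a prescribed format.

```python
# Minimal decision-logging sketch for accountability/traceability: append one
# hash-chained record per decision. Field names and the JSONL file are
# illustrative assumptions, not a standardized audit format.
import hashlib
import json
from datetime import datetime, timezone

def log_decision(path: str, model_version: str, inputs: dict, output,
                 accountable_party: str, prev_hash: str = "") -> str:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "inputs": inputs,
        "output": output,
        "accountable_party": accountable_party,
        "prev_hash": prev_hash,  # chaining makes later tampering detectable
    }
    record_hash = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    record["hash"] = record_hash
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record_hash  # pass into the next call to chain records

h = log_decision("decisions.jsonl", "credit-model-1.3", {"income": 42000}, "approve", "loan_officer_17")
```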
5.8.2. Auditability
The AI Act has been interpreted as the European ecosystem for conducting AI auditing [122]. In the strict sense, the need for certifying systems that embed AI-based functionalities in their design is starting to permeate even the international ISO standards for AI robustness. In such standards, formal methods for requirement verification or requirement satisfaction, typical of software engineering, are being extended towards verifying desirable properties of AI models. More specifically, in order to certify neural networks, properties such as stability, sensitivity, relevance or reachability are sought [110].

In terms of auditing procedures, especially when the AI system interacts with users, grading schemes adapted to the use case [123] are needed to validate models. Examples include the System Causability Scale [123] and the Muir Trust Scale [124], which are widely adopted in human-robot interaction and robotics and rely on predictability (to what extent can the robot behavior – the output of the AI-based system – be predicted from moment to moment?), reliability (to what extent can you count on the system to do its job?), competence (what degree of faith does the user have in the system to cope with similar situations in the future?) and trust (how much does the user trust the system overall?).

5.8.3. Minimization and reporting of negative impacts and trade-offs
The urgent need for developing stable and verifiable mechanisms for auditing AI-based systems becomes even more relevant in the case of generative AI, which has matured to the point that it is difficult to distinguish between human-created multimodal content and content generated by machines. If such content is not properly identified, it can generate confusion and deception, which may have negative consequences for society, such as the manipulation of public opinion or the dissemination of fake news.

A promising stream along these lines proposes to ground the implementation of verifiable claims [125], which are defined as falsifiable claims for which evidence and arguments can be provided to influence the probability that such claims are true. This proposal stems from the efforts of developers, regulators and other AI stakeholders, and from the need to understand what properties of AI systems can be credibly demonstrated, through what means, and what trade-offs or commitments should and can be quantified.
While the degree of certainty achievable varies across different claims and contexts, the idea is to demonstrate that greater degrees of evidence can be provided for claims about AI development than is typically done today, so as to facilitate auditing them.

5.8.4. Redress
Lastly, once a risk has turned into a confirmed incident, it is paramount that the user is aware of the possibility of redress, preserving his/her trust when an adverse or unfair impact takes place [3]. Redress is related to the concept of algorithmic recourse [116], and consists of a procedure to correct or reverse an AI system outcome that is considered wrong. A key to trustworthy AI is ensuring adequate redress against decisions made by AI systems, and by the humans operating them, through mechanisms that are accessible to their users when these systems fail, without forgetting vulnerable persons or collectives. Redress mechanisms are to be ensured and complemented with accountability frameworks and disclaimers, since certification will obey particular application domains and cannot replace responsibility. Machine unlearning [126], counterfactual explanations [127] and the analysis of disparate impacts [128] can also be regarded as techniques that can support redress in AI-based systems.

6. Trustworthy Artificial Intelligence from theory to practice and regulation: responsible Artificial Intelligence systems
So far we have presented the vision of trustworthy AI that has been tackled in most of the literature: a theoretical point of view, mainly based on principles and recommendations. In this section we highlight the importance of tackling trustworthy AI from a practical perspective. A clear mapping from trustworthy AI principles and requirements into operative protocols that can be automated, verified and audited does not always exist. To achieve this, the field needs blueprints and standard models to be adopted and standardized. In what follows we stress the utmost importance of having practical regulatory scenarios (regulatory sandboxes) and the final output of processes implementing HRAIs under trustworthy AI: a responsible AI system.

According to this idea, the section is organized as follows. First, Subsection 6.1 defines the nuanced yet necessary notion of responsible AI systems, which comply with both trustworthy AI requirements and the law in force. Then, Subsection 6.2 describes the technical requirements that the implementation of HRAIs will legally require in practice, and Subsection 6.3 presents how these requirements are going to be evaluated by regulators and auditors through regulatory sandboxes. Subsection 6.4 examines whether all these steps can be connected and applied through a blueprint proposal to implement trustworthy AI in healthcare. Finally, Subsection 6.5 examines the implications of new HRAIs and emerging AI systems, justifying the necessity of a dynamic regulation and flexible evaluation protocols to deal with the new high-risk scenarios supported by these systems.

6.1. Responsible Artificial Intelligence systems
The term responsible AI slightly predates trustworthy AI and has been widely used almost as a synonym. However, it is necessary to make an explicit statement on the similarities and differences that can be established between trustworthy and responsible AI.
The main aspect that makes these concepts differ from each other is that responsible AI emphasizes the ethical use of an AI-based system, its auditability, accountability and liability.

In general, when referring to responsibility over a certain task, the person in charge of the task assumes the consequences of his/her actions and decisions in undertaking it, whether they eventually turn out to be right or wrong. When translating this concept of responsibility to AI-based systems, the decisions issued by the system in question must be accountable, legally compliant and ethical. Other requirements for trustworthy AI reviewed in this manuscript (such as robustness or sustainability) are not relevant to responsibility. Therefore, trustworthy AI provides a broader umbrella that contains responsible AI and extends it towards considering other requirements that contribute to the generation of trust in the system. It is also worth mentioning that providing responsibility over AI products links to the provision of mechanisms for algorithmic auditing (auditability), which is part of requirement 7 (Accountability, Subsection 5.8). Stressing the importance of a responsible development of AI, we now define the responsibility associated with AI systems, following the discussed features.

Definition. A Responsible AI system requires ensuring auditability and accountability during its design, development and use, according to the specifications and the applicable regulation of the domain of practice in which the AI system is to be used.

In the following we discuss these features in depth:
1. Auditability: as an element to aid accountability, a thorough auditing process aims to validate the conformity of the targeted AI-based asset with 1) vertical or sectorial regulatory constraints; 2) horizontal or AI-wide regulations (e.g., the EU AI Act); and 3) the specifications and constraints imposed by the application for which it is designed. It is important to note that auditability refers to a property sought for the AI-based system, which may require transparency (e.g. explainability methods, traceability), measures to guarantee technical robustness, etc. This being said, the auditability of a responsible AI system may not necessarily cover all requirements for trustworthy AI, but rather those dictated by ethics, regulation, specifications and protocol testing adapted to the application sector (i.e., vertical regulation).
2. Accountability: which establishes the liability of decisions derived from the AI system's output, once its compliance with the regulations, guidelines and specifications imposed by the application for which it is designed has been audited. Again, accountability may comprise different levels of compliance with the requirements for trustworthy AI defined previously.

In the context of the European approach and the AI Act, this translates into the required pre-market use of regulatory sandboxes, and into the adaptation of the requirements and regulation for trustworthy AI into a framework for the domain of practice of the AI system.
6.2. Artificial Intelligence systems' compliance with regulation in high-risk scenarios
It was concluded in the previous section that conformity with requirements is central to the definition of responsible AI systems. In Europe, the regulatory requirements in force for the deployment of AI-based systems are prescribed based on the risk of such systems causing harm. Indeed, the AI Act agreed by the European Parliament, the Council of the European Union and the European Commission is foreseen to be a landmark piece of legislation governing the use of AI in Europe, regulating this technology based on the definition of different levels of risk: minimal, limited and high-risk (HRAIs). For these categories, different requirements for trustworthy AI and levels of compliance are established, from which regulatory obligations are enforced.

Furthermore, the European Commission has also asked the European Committee for Standardization (CEN), the European Committee for Electrotechnical Standardization (CENELEC) and the European Telecommunications Standards Institute (ETSI) to develop standards aimed at covering different practical aspects of AI systems, including foundational AI standards, data standards related to AI, Big Data and analytics, use cases and applications, governance implications of AI and computational approaches of AI. Ethical and societal concerns and AI trustworthiness also appear among the prioritized areas of these standardization bodies.

Among the defined levels of risk associated with AI-based systems, those at the top of this classification (HRAIs) are subject to stringent obligations. HRAIs are required to comply with the AI Act through the fulfillment of the following seven requirements (AI Act, Chapter 2 [4]), summarized in the checklist sketch given after this list:
1. Adequate risk assessment and mitigation systems (Art. 9 - Risk management system).
2. High quality of the datasets feeding the system, to minimize risks and discriminatory outcomes (Art. 10 - Data and data governance; Art. 9 - Risk management system).
3. Logging of activity to ensure traceability of results (Art. 12 - Record-keeping; Art. 20 - Automatically generated logs).
4. Detailed documentation providing all information necessary on the system and its purpose for authorities to assess its compliance (Art. 11 - Technical documentation; Art. 12 - Record-keeping).
5. Clear and adequate information to the user (Art. 13 - Transparency).
6. Appropriate human oversight measures to minimise risk (Art. 14 - Human oversight).
7. High level of robustness, security and accuracy (Art. 15 - Accuracy, robustness and cybersecurity).
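The sketch below encodes the seven obligations above as a simple machine-readable checklist that a provider could attach to a system's technical documentation and fill in during a conformity self-assessment; the data structure, field names and status values are illustrative assumptions, not an official AI Act artifact, and the article references are taken directly from the list above.

```python
# Illustrative compliance checklist for the seven HRAI obligations listed above.
# The structure and status values are an informal sketch, not an official format.
from dataclasses import dataclass, field

@dataclass
class ChecklistItem:
    requirement: str
    articles: list[str]
    status: str = "pending"   # e.g. "pending", "in_progress", "evidence_attached"
    evidence: list[str] = field(default_factory=list)  # links to reports, logs, docs

hrai_checklist = [
    ChecklistItem("Risk assessment and mitigation", ["Art. 9"]),
    ChecklistItem("Data quality and governance", ["Art. 10", "Art. 9"]),
    ChecklistItem("Activity logging / traceability", ["Art. 12", "Art. 20"]),
    ChecklistItem("Technical documentation", ["Art. 11", "Art. 12"]),
    ChecklistItem("Transparency and information to users", ["Art. 13"]),
    ChecklistItem("Human oversight measures", ["Art. 14"]),
    ChecklistItem("Accuracy, robustness and cybersecurity", ["Art. 15"]),
]

hrai_checklist[2].status = "evidence_attached"
hrai_checklist[2].evidence.append("decisions.jsonl")  # e.g. the decision log sketched earlier
print(sum(item.status == "evidence_attached" for item in hrai_checklist), "of 7 items evidenced")
```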
For these reasons, in order to realize trustworthy AI that is compliant with the law, we advocate for the development of responsible AI systems, i.e., systems whose implementation not only fulfills the requirements for trustworthy AI, but also complies with the AI regulation.\nIn practice, HRAI providers must work to make their assets meet these requirements, including post-market monitoring plans [122] (AI Act, Art. 61 - Post-market monitoring by providers and post-market monitoring plan for high-risk AI systems) to document the system’s performance throughout its lifetime, in a way that vague concepts become verifiable criteria that strengthen the assessment safeguards and internal checks. Likewise, conformity assessments (AI Act, Art. 19 and Art. 43) will be ruled by internationally harmonized testing principles, guaranteeing high-quality testing. These tests can depart from ad-hoc procedures and protocols for the domain at hand. This is the case of the German standardization roadmap on AI [11], which proposes conformity assessments based on several steps: calibration, inspection, audit, validation and verification.\nThis need for harmonized testing protocols, monitoring plans and conformity assessment procedures is the main reason for the emergence of the concept of AI regulatory sandboxes, which are next detailed and discussed.\n6.3. Artificial Intelligence regulatory sandboxes: A challenge for auditing algorithms\nOnce requirements for HRAIs have been established, the remaining challenge is to make the AI system comply with them appropriately. Such requisites (AI Act, Chapter 2, Art. 8-15) motivate the need for a test environment in which to audit AI-based systems through the safe and harmonized procedures established therein. Regulatory sandboxes are indeed recommended by the AI Act (Chapter 5, Art. 53-54). Concretely, the AI Act establishes that algorithms should comply with regulation and can be tested in a safe environment prior to entering the market. This auditing process can be implemented via regulatory sandboxes.\nIn order to successfully undertake AI auditing processes under the new regulation, industry, academia and governmental actors are forced to adapt their processes and teams to comply with the law. Regulatory sandboxes act as test beds and safe playgrounds that allow assessing the compliance of AI systems with respect to the regulation, risk mitigation strategies, conformity assessments, accountability and auditing processes established by the law. Figure 4 illustrates the two stages where sandboxes play a crucial role: i) pre-market auditability and conformity check, and ii) post-market monitoring and accountability. The figure shows not only the different stakeholders participating in these two stages, but also the articles in the AI Act where each step within the process is described.\nIn the current context of rapidly evolving AI products, sandboxes allow market stakeholders and business players to explore and experiment with new and innovative products, services or businesses under the supervision of a regulator. However, the idea of resorting to a sandbox to explore, evaluate and gauge complex technology is not new, nor exclusive to AI systems.
They have already been used in other contexts to test and validate Fintech [130] or Blockchain16 technologies in the European Union. The objective of these controlled environments is to test innovative technologies for a limited time, allowing innovators and regulators to cooperate17.\n16 Launch of the European Blockchain Regulatory Sandbox: https://digital-strategy.ec.europa.eu/en/news/launch-european-blockchain-regulatory-sandbox, accessed on April 25th, 2023.\n17 First regulatory sandbox on Artificial Intelligence presented: https://digital-strategy.ec.europa.eu/en/news/first-regulatory-sandbox-artificial-intelligence-presented\nThe AI Act also contains measures aimed at reducing the regulatory burden on Small and Medium Enterprises (SMEs) and startups, prioritizing them, and reducing their time to market by ensuring that the legislation can be implemented in two years. The intended goal is to support innovation and small-scale providers, moving away from the critique that regulation stifles innovation.\nThe benefit of sandboxes is that they support the development, testing and validation of innovative AI systems under the direct supervision and guidance of competent authorities (AI Act, Art. 53). Furthermore, they allow experimentation by derogation (by putting aside certain rules or laws) and experimentation by devolution, which requires broad supra/national frameworks to establish guidelines that empower and help local governments to establish regulation in a particular area. This enables differences among government levels by considering local preferences and needs as a means to stimulate innovative policies.\nWhen it comes to the challenges faced by sandboxes, there is a concern about the lack of proper methodological assessments that are indicative of the possible impact of AI on society [131]. This concern fosters the need for cross-border and multi-jurisdictional regulatory sandbox standardization [132], as well as generic AI standardization [133]. Governments will have to find a balance between EU coordination and national procedures to avoid conflicts in the implementation of the regulation [134]. Specifically, in the AI Act (Art. 53), participants in the sandbox remain liable under applicable liability legislation, and eligibility criteria and participants’ obligations and rights are to be set up in implementing acts.\nDerived from the above challenge, we note that sandboxes are still far from maturity. This leads to two main aspects that remain unresolved: 1) the design of sandboxes with guidelines that rapidly and effectively permit algorithmic auditing; and 2) the development of intelligent systems for high-risk scenarios that are validated through the necessary auditing processes. Important efforts are currently driven towards addressing these aspects as two additional fundamental challenges. At the European level, Spain is leading a pilot to set up a regulatory sandbox according to the European AI Act.\nTogether with sandboxes working in practice, additional future mechanisms will include certification and quality control within a regulatory framework.
In this sense, Spain is starting to develop a national seal of quality to certify the security and quality of AI technology used in Spain. In cooperation with industry, the technical criteria for companies to obtain this seal will be set up, and tools will be developed to facilitate the certification process, e.g., self-assessment software. Several companies will be given the possibility to grant the seal, which will be voluntary for AI companies to obtain. At the international level, one effort towards this end is the IEEE CertifAIEd program18 to assess the ethics of Autonomous Intelligent Systems via certification guidance, assessment and independent verification. This mark is meant for IEEE-authorized assessors and certifiers to perform an independent review and verification in order to grant a mark and certificate based on ontological specifications for Ethical Privacy, Algorithmic Bias, Transparency, and Accountability.\n18 IEEE CertifAIEd: https://engagestandards.ieee.org/ieeecertifaied.html, accessed on June 6th, 2023.\nFigure 4: Diagram showing the role of sandboxes before (ex-ante) and after (post-hoc) the AI-based system has been deployed in the market. Sandboxes permit evaluating the conformity of the AI-based system w.r.t. technical specifications, horizontal & vertical regulation, and ethical principles in a controlled and reliable testing environment.
Once conformity has been verified, sandboxes can be used to interface with the deployed AI-based asset via the established monitoring plan, so that information about its post-market functioning can be collected and processed. This information is used by the national supervisory authority to evaluate compliance: if needed, the authority asks for corrective actions and/or reports serious incidents or a continued lack of compliance to the European Commission. Articles in the AI Act related to each step are cited in the diagram (panels: pre-market auditability & conformity, ex-ante; post-market monitoring & accountability, post-hoc).\nWe expect that the first experiences and results of running regulatory sandboxes, and their alignment with certification activities, will permit lessons to be learned, AI systems to be improved and, eventually, the progressive proliferation of responsible AI systems deployed in practical scenarios to be supported. We believe that sandbox assessments should be periodically performed by independent and impartial assessment bodies to certify and audit AI systems during their lifetime.\n6.4. Practical case study in Artificial Intelligence for healthcare\nAt the time of writing (April 2023), the AI Act draft is constantly being updated through different amendments, due in part to new versions of AI products accessible to the general public. Given how essential the translation of principles and regulation into specific processes is, it becomes necessary to have blueprint models and protocols that serve to assess how trustworthy AI systems are.\nThe blueprint for Trustworthy AI Implementation Guidance and Assurance for Healthcare is one step taken in this direction. Figure 5 shows the proposal by the Coalition for Health AI [135], based on collaboration, guiding principles and leadership actions. It is aligned with the AI risk management framework from the National Institute of Standards and Technology (NIST). In particular, to ensure trustworthiness this framework describes four key functions [135]:\n• Mapping, which frames AI risks,\n• Measuring these risks and impacts quantitatively and qualitatively,\n• Managing the allocation of risk resources, and a cross-cutting\n• Governance function via risk management.\nFigure 5: The Coalition for Health AI (CHAI) Blueprint for Trustworthy AI implementation guidance and assurance [135]. It considers obligations on reliability and testability, transparency, mitigation of biases particular to the domain, privacy, security and resilience, among other aspects. (Diagram elements: Collaboration, Guiding principles, Leadership; Integrated Data Infrastructure for Discovery, Evaluation and Assurance; Health System Preparedness & Assessment; AI Tools, Trustworthiness and Transparency across the Lifecycle.)\nBased on these functions, they define values to set the key elements of trustworthy AI in healthcare [135]:\n1. Useful: valid and reliable, testable, usable and beneficial. These values are closely linked to social wellbeing (Requirement 6, Subsection 5.7) and auditability (Requirement 7, Subsection 5.8).
2. Safe, which is related to technical robustness and safety (Requirement 2, Subsection 5.3).\n3. Accountable and transparent, with clear connections to accountability (Requirement 7, Subsection 5.8) and transparency (Requirement 4, Subsection 5.5).\n4. Explainable and interpretable, echoing several dimensions of the transparency requirement.\n5. Fair, with harmful bias managed (systemic bias, computational and statistical biases, and human-cognitive biases). The consideration of fairness and the focus on managing the consequences of harmful biases are present in requirement 5 (Diversity, non-discrimination and fairness, Subsection 5.6), particularly in the avoidance of unfair biases. Furthermore, requirement 7 (Accountability, Subsection 5.8) also regards the minimization of negative impacts, either due to harmful biases or to other consequences.\n6. Secure and resilient, which relate to the dimension of traceability (Requirement 4, Subsection 5.5) and technical robustness (Requirement 2, Subsection 5.3).\n7. Privacy-enhanced, which is coupled with Requirement 3 - Privacy and data governance (Subsection 5.4).\nIt is important to underscore that some dimensions of the requirements for trustworthy AI discussed in Section 5 are not reflected (at least, explicitly) in the above list of values, e.g., environmental wellbeing or reproducibility. This resonates with our formulated definition of a responsible AI system, showing that such a system, depending on its domain of application, may require different degrees of compliance with the requirements for trustworthiness.\nThe blueprint analyzed in [135] recognizes the difficulties of building ecosystems when multiple guidelines are left out in the wild without a standardization consensus. It calls for mapping socio-technical scenarios to resolve tensions among principles, an ISO-based approach to professional responsibility, and institutionalizing trustworthy AI systems (that is, responsible AI systems).\nAs a follow-up of the blueprint, the Coalition for Health AI [135] suggests:\n• Setting up an independent infrastructure of engaged assurance labs and advisory services.\n• Institutionalizing trustworthy AI systems (responsible AI systems).\n• Promoting a Coalition of the Willing through strategies that can be applied in health AI to drive positive change.\nOther practical frameworks exist. They provide strategies to implement ethics and the governance of AI systems in health, separating the factors affecting trustworthy medical AI into design (data and algorithm aspects) and application. This is done through control strategies [136] at both the design and application phases. First, the ethical governance system departs from social needs and ethical values, which lead to ethical principles to be enforced at the research stage. After that, those principles guide the ethical norms that allow performing risk assessment, and later make the law and regulation concrete. In particular, the framework in [137] aims at avoiding situations that can have dire consequences for patients, for instance by integrating the applied ethics Z-Inspection® [138] process to map and assess tensions in socio-technical scenarios in trustworthy AI. Another proposal [139] formulates ten commandments (i.e., high-level ethical principles) that should be met by medical AI-based systems.
These commandments are formulated without the underlying theoretical aspects for the sake of easier understanding and verification by all stakeholders involved in this domain. European fundamental rights also provide legal and ethical guidelines for the adoption, development and application of medical AI [140]. These strategies and the blueprint are advancing research in the area, and their results will be extensible to other domains to attain trustworthy AI.\nDespite the clear establishment of the seven requirements for HRAIs within the AI Act described in Section 6.2, the particular implementation steps to be taken within a given area of application often remain under-specified. It becomes evident that the AI-based system is stringently dependent on the sector of application, as well as on the coupling of the requirements for trustworthy AI to existing regulations and standards. Therefore, for a given domain of practice, an overarching consideration of the complete scenario is needed from the practical Trustworthy AI point of view. The field needs to further specify legal requirements, risk assessment tools for the ethical impact of the AI system, data privacy and data governance models, ad-hoc risk management systems and conformity assessments, and the rest of the essential elements evaluated in the regulatory sandboxes testing the scenario. This may also spur the emergence of generic trustworthy AI frameworks (regulatory sandboxes) that can potentially be adapted to different domains, as discussed in [141].\n6.5. Urgent needs for emerging AI systems, dynamic regulation, and evaluation protocols\nThe widespread use and repercussions of the achievements of emerging AI systems, such as GPAIS or neuroscience technology, have brought to the public arena the potential and implications of new high-risk scenarios supported by these technological advances. In this section we discuss potential issues to be tackled to regulate new HRAIs as well as future emerging AI systems. We argue that regulation should be dynamic and malleable to establish the boundaries of new high-risk scenarios supported by technological AI advances. Likewise, we also highlight the need for flexible evaluation procedures that can be adapted in an agile way to cope with the fast evolution of AI systems.\nIndeed, the rapid pace at which AI evolves over time can unexpectedly give rise to new high-risk scenarios beyond those defined by regulation such as the AI Act (Section 6.2). This requires regulatory protocols to cope with newly emerging applications. In the case of the European AI Act, on 11th May 2023, MEPs endorsed new transparency and risk-management rules for AI systems19. MEPs expanded the classification of high-risk areas to include those that could compromise or harm people’s health, safety, fundamental rights or the environment. The revised classification of high-risk scenarios also considers AI systems used to influence voters in political campaigns, as well as recommender systems (with more than 45 million users) utilized by social media platforms.
Intrusive and discriminatory uses of AI-based biometric systems have also been identified as prohibited AI systems, such as:\n• “Real-time” remote biometric identification systems in publicly accessible spaces;\n• “Post” remote biometric identification systems, with the only exception of law enforcement for the prosecution of serious crimes and only after judicial authorization;\n• Biometric categorization systems using sensitive characteristics (e.g., gender, race, ethnicity, citizenship status, religion, political orientation);\n• Predictive policing systems (based on profiling, location or past criminal behavior);\n• Emotion recognition systems in law enforcement, border management, the workplace, and educational institutions; and\n• Indiscriminate scraping of biometric data from social media or CCTV footage to create facial recognition databases (violating human rights and the right to privacy).\n19 AI Act: a step closer to the first rules on Artificial Intelligence, https://www.europarl.europa.eu/news/en/press-room/20230505IPR84904/ai-act-a-step-closer-to-the-first-rules-on-artificial-intelligence, accessed on June 6th, 2023.\nIn this revision of the AI Act, MEPs have also proposed tailored regulatory regimes for new and fast-evolving developments in the field of AI and GPAIS. Since GPAIS are systems that have a wide range of possible uses without substantial modification and fine-tuning, generative foundation models are examples of rapidly evolving areas for which, if regulation is not set in place, consequences may be hard to revert. Such systems must guarantee robust protection of fundamental rights, health and safety, the environment, democracy and the rule of law. To this end, such emerging AI systems must assess and mitigate risks, comply with design, information and environmental requirements, and be registered in the EU database. Furthermore, additional transparency requirements have been demanded for generative foundation models such as GPT: they must disclose that the content is generated by an AI model, the model must be designed to avoid generating illegal content, and summaries of the copyrighted content used during training must be published. Jurisdiction at the national level will also need to be adapted to the different considerations demanded by different sectors, e.g., the public sector or the labor sector.\nAnother area in which regulation and technology have yet to advance is the copyright management of generated artwork produced by foundation models. Although the AI Act requires disclosing the use of copyrighted material in the training data, there is currently no way to detect when AI-generated content may be directly related to existing content protected by copyright, nor is it clear who owns the intellectual property of generative models’ outputs [142, 143].\nBesides GPAIS, other emerging AI-based technologies also require specialized adjustments of ongoing regulatory efforts. This is the case of neurotechnology, such as brain interfaces. The need to handle novel applications never used before becomes evident from recent research [144] that shows the potential of “mind-reading” [145]. For instance, the study in [144] shows the potential of leveraging language models as an autoregressive prior to generate novel sequences that can decode structured sequential information in the form of text from brain signals.
Although the study of human imagination decoding shows that human cooperation is required for the approach to work, this may not be a requisite in the future. Even if decoding is not yet accurate, these systems could be used maliciously.\nThese recent results attained by neurotechnology call for raising awareness about the risks posed by brain decoding technology, and for the design of regulation and policies to preserve fundamental rights such as mental privacy. A role model in this direction is the novel neurorights regulation pioneered by Chile20, which has taken a step towards the first Neurorights law in this country. Their neuroprotection agenda (which is closely followed by Spain) leads the way to the regulation of brain technology and the legislation of advances in AI-supported medicine and science from a human rights point of view. This is of utmost importance to avoid mental manipulation, as mental diseases are the grand pandemic of the XXI century. If used appropriately, AI-based technology applied to the diagnosis and treatment of mental diseases has great potential to improve patients’ quality of life.\n20 NeuroRights Foundation, https://neurorightsfoundation.org/, accessed on June 6th, 2023.\nIn order for regulation to evolve synchronously with technological advances (and vice versa), evaluation procedures must be flexible and dynamically adaptable to accommodate the implications of these advances over time. More comprehensive evaluation tools need to be adopted by the AI community of practitioners and users if we aspire to synergistic solutions that can complement governmental efforts. In particular, ML and AI model evaluation is normally based on leader-board benchmarks that do not always reflect reality, and may have a detrimental effect when they are not faithful to reality. A more accountable evaluation should go beyond aggregated metrics. Improperly constructed benchmarks may, for instance, reflect an unrealistic overestimation of the capabilities of a model when predicting over minority classes. This may lead to hazards that end up obfuscating the real benefits of AI.\nTo avoid these issues, important guidelines for robust evaluation practices [146] include:\n1. Granular performance reporting protocols with breakdowns across the features that have been shown to affect performance (see the sketch below).\n2. Designing benchmarks to test capabilities and to vary significantly on important features of the problem space, labeling instances to allow for granular analyses (e.g., as in the Holistic Evaluation of Language Models benchmark [147]).\n3. Recording all results, successful or failing, partial or fully traced, in supplementary material or public repositories for each run and validation split separately (e.g., in medical AI [148]).\n4. Enabling follow-up instance-level analyses by researchers, by including data labels and annotations of those instances.\nHowever, prospective evaluation methodologies such as the one described above should be versatile and extensible to embrace and incorporate new performance metrics, evaluation protocols or even modeling tasks proposed over the years.
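As an illustration of guideline 1 above, the following minimal Python sketch reports an aggregate accuracy together with a per-group breakdown, so that slices hidden by the aggregate become visible. It is a sketch under our own assumptions: the data, the group column and the choice of accuracy as the metric are hypothetical and not taken from [146].

```python
# Minimal sketch of granular performance reporting (guideline 1 above):
# report the aggregate metric together with a breakdown across a feature
# known (or suspected) to affect performance. Data and column names are hypothetical.
import pandas as pd

results = pd.DataFrame({
    "y_true": [1, 0, 1, 1, 0, 1, 0, 0],
    "y_pred": [1, 0, 0, 1, 0, 1, 1, 0],
    "group":  ["A", "A", "B", "B", "B", "A", "B", "A"],  # e.g., a demographic or data-source slice
})

# Per-instance correctness, kept at instance level so it can be persisted per run/split.
results["correct"] = (results["y_true"] == results["y_pred"]).astype(int)

aggregate_accuracy = results["correct"].mean()
per_group_accuracy = results.groupby("group")["correct"].agg(["mean", "count"])

print(f"Aggregate accuracy: {aggregate_accuracy:.2f}")
print(per_group_accuracy)  # exposes slices where the aggregate hides poor performance
```

Persisting such per-instance tables for each run and validation split, together with their labels and annotations, would also go in the direction of guidelines 3 and 4.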
Most importantly, the sought flexibility of these evaluation methodologies should not give rise to exceptional cases that would undermine the validity and applicability of regulations in force.\nWe conclude that, given the fast pace at which AI has progressed in recent months, it is of paramount importance to have a dynamic regulation from a double perspective: the appearance of risk-based scenarios and the emergence of novel AI systems. Only in this way will regulation facilitate the realization of responsible AI systems, in parallel to the development of methodologies for algorithmic auditing and the clarification of responsibilities in the use of such systems.\n7. From the Artificial Intelligence moratorium letter to regulation as the key for consensus\nAt the time of writing, a global debate is being held around the moratorium letter published by several renowned researchers calling for a pause in large AI experimentation21. The letter can be interpreted as a contribution to pointing out the gap between the fast advance of high-powered AI systems and the regulation. The letter also highlights that:\n“AI research and development should be refocused on making today’s powerful, state-of-the-art systems more accurate, safe, interpretable, transparent, robust, aligned, trustworthy, and loyal.”\nFollowing up on this moratorium letter, several declarations and written statements by reputed experts have been published to approach the AI conundrum between ethics, regulation and technological progress from different perspectives. Among them, we highlight the interview with G. Hinton22, in which he states that “We need to find a way to control artificial intelligence before it’s too late”. Interestingly, under the scope of this work, he has also underscored the different nature of intelligent systems when compared to human intelligence, and thereby the need for establishing regulation for these artificial systems:\n“Our brains are the result of evolution and have a series of integrated goals — such as not hurting the body, hence the notion of damage; eating enough, hence the notion of hunger. Making as many copies of ourselves as possible, hence the sexual desire. Synthetic intelligence, on the other hand, hasn’t evolved: we’ve built it. Therefore, it doesn’t necessarily come with innate goals. So, the big question is, can we make sure that AI has goals that benefit us? This is the so-called alignment problem.
And we have several reasons to be very concerned.”\n21 Future of Life Institute, Pause giant AI experiments: An open letter, https://futureoflife.org/open-letter/pause-giant-ai-experiments/, accessed on April 25th, 2023.\n22 Geoffrey Hinton: “We need to find a way to control artificial intelligence before it’s too late”, https://english.elpais.com/science-tech/2023-05-12/geoffrey-hinton-we-need-to-find-a-way-to-control-artificial-intelligence-before-its-too-late.html, accessed on June 4th, 2023.\n23 Yuval Noah Harari argues that AI has hacked the operating system of human civilisation, https://www.economist.com/by-invitation/2023/04/28/yuval-noah-harari-argues-that-ai-has-hacked-the-operating-system-of-human-civilisation, accessed on June 4th, 2023.\nA similar line of thinking has been expressed by Harari23, emphasizing the pressing immediacy with which regulation is needed to match the speed of AI technological development with the public use of AI systems:\n“We can still regulate the new AI tools, but we must act quickly. Whereas nukes cannot invent more powerful nukes, AI can make exponentially more powerful AI. The first crucial step is to demand rigorous safety checks before powerful AI tools are released into the public domain. Just as a pharmaceutical company cannot release new drugs before testing both their short-term and long-term side-effects, so tech companies shouldn’t release new AI tools before they are made safe. We need an equivalent of the Food and Drug Administration for new technology, and we need it yesterday.”\nAnother example is the proposal made by Sam Altman (OpenAI co-founder) before the US Senate to defend the benefits of this revolutionary technology, claiming that AI regulation should ensure that the public has access to its many advantages24:\n“[...] we are not alone in developing this technology. It will be important for policymakers to consider how to implement licensing regulations on a global scale and ensure international cooperation on AI safety, including examining potential intergovernmental oversight mechanisms and standard-setting.”\nFinally, the manifesto on AI risk supported by multiple scientists and notable figures in the AI landscape has called for elevating the mitigation of AI risks to the priority level of other humanity-threatening risks25:\n“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”\nThe controversy held around these and other statements is whether humanity is close to or far from the moment at which AI can pose a realistic threat to its own existence. Unquestionably, triggering this debate has ignited even further the need for ethical and regulatory frameworks that regulate whether and how AI-based systems can be trusted and used in practical setups.\nAfter this latter manifesto, the Center for AI Safety has published a new document entitled ‘Existing policy proposals aimed at present and future harms’26. The aim of this one-page document is to describe three proposals that, in our view, promote AI safety.
We follow with a short analysis of these proposals:\n24 Written Testimony of Sam Altman, Chief Executive Officer, OpenAI, https://www.washingtonpost.com/documents/0668f6f4-d957-4b94-a745-2aa9617d1d60.pdf?itid=lk_inline_manual_18, accessed on June 4th, 2023.\n25 Statement on AI Risk, https://www.safe.ai/statement-on-ai-risk, accessed on June 4th, 2023.\n26 Existing Policy Proposals Targeting Present and Future Harms, https://www.safe.ai/post/three-policy-proposals-for-ai-safety, accessed on June 7th, 2023.\n• Legal Liability for AI harms: the first issue highlighted in the document is the need for establishing improved legal liability frameworks for the accountability of damages caused by the application of AI systems. The document also refers to GPAIS, warning against the ill-advised implementation of legal exemptions that absolve GPAIS developers of liability, as such exemptions could unfairly shift the entire burden of responsibility from large corporations to smaller actors, users and communities lacking the necessary resources, access and capabilities to effectively address and alleviate all risks.\n• Increased regulatory scrutiny: the second problem emphasized in this document is the need for greater regulatory inspection during the development of AI systems, extending beyond the application layer to encompass the entire product lifecycle. It underscores the importance of holding companies responsible for the data and design choices they make when developing these models. In line with this proposal, increased transparency and regulations over training data are crucial to address algorithmic bias effectively, and to prevent companies from unfairly leveraging copyrighted materials through data modeling without compensating their creators.\n• Human supervision of automated systems: the third theme in the document is the importance of human oversight in the implementation of HRAIs. Human oversight can contribute to lessening potential concerns with bias and the propagation of false or misleading information through AI systems. An explicit reference is made to the EU’s regulatory proposal, with a positive emphasis on the importance therein granted to human oversight in the deployment of HRAIs.\nOur position, as we put it in this manuscript, is that “regulation is a key for consensus” among these diverging voices to cast light over the shadows of modern AI technologies. For this to occur, technologies, methodologies and tools supporting the development, auditability and accountability of responsible AI systems are of utmost importance to cope with high-risk scenarios and to meet regulatory constraints.\nTo finish this section, we pay attention to a final point made by the authors of [15]. Unfortunately, this road towards consensus is not exempt from its own risks. Indeed, conflating trust and trustworthiness with the acceptability of risks blurs the distinction between acceptability judgments made by domain experts and the trustworthiness of AI systems implemented in society [15]. It has been argued that trust is unlikely to be produced on demand and impossible to produce on command, as “trust engineering” may backfire and not achieve its goal.
Focused on trust and trustworthiness in AI in the public sector, [15] discusses four acute challenges facing the European Commission’s attempt to signal the trustworthiness of AI through its proposed regulation: the uncertainty about the antecedents of perceived trust in public institutions that utilize AI; the threat of misalignment between trustworthiness and degrees of trust; concealed behavioral factors behind the acceptability of risks; and the need for impartial intermediaries.\nDespite these and other curves in the road, regulation can be an unquestionable driving force to consolidate and put all these diverging voices on the same page. Regulation has favored consensus about the benefits and restrictions of technological advances that have evolved faster than expected, permeating quickly into society (e.g., social networks, the Internet or mobile communications). AI should not be an exception. There is still a long way to go before we have fully aligned AI technology and regulation, developing responsible AI systems adapted to each risk scenario and fully leveraging the latest advances in the field. For this to occur, the European regulatory model based on risk-based use case scenarios can serve as a guiding light for the maturity and implementation of ethical, legal and technical frameworks, fostering the creation of industrial and institutional instruments (e.g., AI sandboxes or AI ethics boards [149]) that guarantee that AI-based products and services comply with their requirements.\n8. Concluding remarks\nFor years now, the ever-growing capabilities of AI-powered systems have stimulated debates about the impact, benefits, implications and risks brought by AI systems to industry and society. The ground-breaking potential of large generative AI models such as ChatGPT and GPT-4 has reinvigorated this debate, since their near general-purpose capabilities learned from multimodal data can support a wide variety of intended and unintended purposes and tasks, by generating content that is hardly distinguishable from that made by humans. This notable advance has renewed the relevance and momentum of trustworthy AI systems, particularly with regard to 1) the ethical usage of these models, and 2) the need for regulatory directives that establish what, when and how AI systems can be adopted in practical applications.\nIn this context, this manuscript has shed light on the principles, pillars and requirements to be met by trustworthy AI systems to be considered as such. To this end, we have departed from mature regulation/supervisory frameworks developed around trustworthy AI (e.g., the AI Act) to provide clear definitions of all related concepts, placing emphasis on what each requirement for trustworthiness in AI stands for, why it contributes to generating trust in the user of an AI-based system, and how such requirements can be met technically. Regarding the latter, a short tour over the technological areas that can contribute to each of these requirements has been offered. Our study has also overviewed ethical principles for the development of AI, which establish an overarching set of recommendations that ensure that this discipline will be advanced under social and ethical standards.
The study has been complemented by a discussion of practical aspects to be considered in the design, development and use of trustworthy AI systems, stressing the importance of assessing their conformity to regulations (auditability) and of explaining how their decisions are issued (accountability). These two practical aspects must be met by responsible AI systems. Further along this line, accountability and explainability have permeated deeply into the recommendations recently issued for the development of trustworthy medical AI, a risk-critical sector with a large demand for trust when embracing new technological advances. Our analysis of such recommendations has exposed that auditability and accountability are at the core of the guidelines proposed in this area, together with ethics, data governance and transparency. Medical AI exemplifies the paramount relevance of considering all these requirements for trustworthiness along the entire AI cycle.\nFor a given domain of practice, we need to assess the complete scenario from the practical Trustworthy AI point of view, that is, all essential elements audited in regulatory sandboxes for scenario testing, together with clear accountability protocols. Above all, the development of responsible AI systems as the final output of the chain is essential and must be the goal for current AI designs and developments.\nIn summary, we hope that this paper serves as a reference for researchers, practitioners and neophytes who are new to the world of AI, with an interest in trustworthy AI from a holistic perspective. A well-rounded analysis of what trust means in AI-based systems and of its requirements, such as the one offered in this manuscript, is key for the design and development of responsible AI systems throughout their life cycle. We should not regulate scientific progress, but rather products and their usage. As we emphasize in this paper, regulation is the key for consensus, and for this purpose, trustworthy AI and responsible AI systems for high-risk scenarios are imperative, as they will contribute to the convergence between technology and regulation, the advance of science, the prosperity of our economies, and the good of humanity, subject to legal requirements and ethical principles.\n9. Acknowledgments\nN. Díaz-Rodríguez is currently supported by a Marie Skłodowska-Curie Actions (MSCA) Postdoctoral Fellowship with agreement ID: 101059332 and the Leonardo Scholarship for Researchers and Cultural Creators 2022 from the BBVA Foundation. J. Del Ser has received funding support from the Spanish Centro para el Desarrollo Tecnológico Industrial (CDTI) through the AI4ES project, and from the Basque Government (Eusko Jaurlaritza) through the Consolidated Research Group MATHMODE (IT1456-22). F. Herrera has received funding support from the Spanish Ministry of Science and Innovation (grant PID2020-119478GB-I00).\nDeclaration of competing interest\nThe authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.\nReferences\n[1] A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford,\nM. Chen, I. 
Sutskever, Zero-shot text-to-image generation, in: Inter-\\nnational Conference on Machine Learning, PMLR, 2021, pp. 8821–\\n8831.\\n[2] C. Saharia, W. Chan, S. Saxena, L. Li, J. Whang, E. L. Denton,\\nK. Ghasemipour, R. Gontijo Lopes, B. Karagol Ayan, T. Salimans,\\nJ. Ho, D. J. Fleet, M. Norouzi, Photorealistic text-to-image diffusion\\nmodels with deep language understanding, in: S. Koyejo, S. Mo-\\nhamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh (Eds.), Advances in\\nNeural Information Processing Systems, Vol. 35, Curran Associates,\\nInc., 2022, pp. 36479–36494.\\nURL https://proceedings.neurips.cc/paper_files/paper/2022/file\\n/ec795aeadae0b7d230fa35cbaf04c041-Paper-Conference.pdf\\n[3] European Commission High-Level Expert Group on AI, Ethics\\nguidelines for trustworthy AI (2019).\\n[4] European Union, Proposal for a Regulation of the European Parlia-\\nment and of the Council Laying down harmonised rules on Artificial\\nIntelligence (Artificial Intelligence Act) and amending certain Union\\nLegislative Acts. COM/2021/206 final (2021).\\n[5] UNESCO, Recommendation on the ethics of artificial intelligence,\\nDigital Library UNESDOC (2020).\\nURL en.unesco.org\\n[6] R. Benjamins, A. Barbado, D. Sierra, Responsible AI by design in\\npractice, in: Proceedings of the Human-Centered AI: Trustworthi-\\nness of AI Models & Data (HAI) track at AAAI Fall Symposium,\\n2019.\\n[7] G. Pisoni, N. Díaz-Rodríguez, H. Gijlers, L. Tonolli, Human-\\ncentered artificial intelligence for designing accessible cultural her-\\nitage, Applied Sciences 11 (2) (2021) 870.\\n[8] B. C. Stahl, D. Wright, Ethics and privacy in AI and big data:\\nImplementing responsible research and innovation, IEEE Security\\n& Privacy 16 (3) (2018) 26–33.\\n[9] M. Coeckelbergh, AI ethics, MIT Press, 2020.\\n[10] M. Coeckelbergh, Artificial intelligence, responsibility attribution,\\nand a relational justification of explainability, Science and engineer-\\ning ethics 26 (4) (2020) 2051–2068.\\n[11] W. Wahlster, C. Winterhalter, German standardization roadmap on\\nartificial intelligence, DIN/DKE, Berlin/Frankfurt (2020) 100.\\n[12] L. Edwards, The EU AI Act: a summary of its significance and scope,\\nAda Lovelace Institute, Expert explainer Report (2022) 26.\\n[13] S. Campos, R. Laurent, A Definition of General-Purpose AI Sys-\\ntems: Mitigating Risks from the Most Generally Capable Models,\\nAvailable at SSRN 4423706 (2023).\\n[14] M. Estévez Almenzar, D. Fernández Llorca, E. Gómez, F. Mar-\\ntinez Plumed, Glossary of human-centric artificial intelligence,\\nTech. Rep. JRC129614, Joint Research Centre (2022).\\n[15] J. Laux, S. Wachter, B. Mittelstadt, Trustworthy artificial intelli-\\ngence and the European Union AI act: On the conflation of trustwor-\\nthiness and acceptability of risk, Regulation & Governance n/a (n/a).\\narXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/rego.125\\n12, doi:https://doi.org/10.1111/rego.12512.\\nURL https://onlinelibrary.wiley.com/doi/abs/10.1111/rego.12512\\n[16] E. Tjoa, C. Guan, A survey on explainable artificial intelligence\\n(XAI): Toward medical XAI, IEEE Transactions on Neural Net-\\nworks and Learning Systems 32 (11) (2020) 4793–4813.\\n[17] D. Doran, S. Schulz, T. R. Besold, What does explainable AI really\\nmean? A new conceptualization of perspectives, arXiv preprint\\narXiv:1710.00794 (2017).\\n[18] Z. C. 
Lipton, The mythos of model interpretability: In machine learn-\\ning, the concept of interpretability is both important and slippery,\\nQueue 16 (3) (2018) 31–57.\\n[19] European Commission High-Level Expert Group on AI, The Assess-\\nment List for Trustworthy Artificial Intelligence (ALTAI) for self\\nassessment (2020).\\n[20] C. Widmer, M. K. Sarker, S. Nadella, J. Fiechter, I. Juvina,\\nB. Minnery, P. Hitzler, J. Schwartz, M. Raymer, Towards Human-\\nCompatible XAI: Explaining Data Differentials with Concept Induc-\\ntion over Background Knowledge, arXiv preprint arXiv:2209.13710\\n(2022).\\n[21] B. Lepri, N. Oliver, A. Pentland, Ethical machines: the human-\\ncentric use of artificial intelligence, Iscience (2021) 102249.\\n[22] G. Pisoni, N. Díaz-Rodríguez, Responsible and human centric AI-\\nbased insurance advisors, Information Processing & Management\\n60 (3) (2023) 103273.\\n[23] N. Tomašev, J. Cornebise, F. Hutter, S. Mohamed, A. Picciariello,\\nB. Connelly, D. C. Belgrave, D. Ezer, F. C. v. d. Haert, F. Mugisha,\\net al., Ai for social good: unlocking the opportunity for positive\\nimpact, Nature Communications 11 (1) (2020) 2468.\\n[24] A. Holzinger, Interactive machine learning for health informatics:\\nwhen do we need the human-in-the-loop?, Brain Informatics 3 (2)\\n(2016) 119–131.\\n[25] World Economic Forum, Empowering AI leadership an oversight\\ntoolkit for boards of directors, Tech. rep. (2019).\\n[26] World Economic Forum, Empowering AI Leadership: AI C-Suite\\nToolkit , Tech. rep. (2022).\\n[27] E. Cambria, L. Malandri, F. Mercorio, M. Mezzanzanica, N. Nobani,\\nA survey on XAI and natural language explanations, Information\\nProcessing & Management 60 (1) (2023) 103111.\\n[28] L. Floridi, Establishing the rules for building trustworthy AI, Nature\\nMachine Intelligence 1 (6) (2019) 261–262.\\n[29] R. Mariani, F. Rossi, R. Cucchiara, M. Pavone, B. Simkin, A. Koene,\\nJ. Papenbrock, Trustworthy AI – Part 1, Computer 56 (2) (2023) 14–\\n18.\\n[30] P.-Y. Chen, P. Das, AI Maintenance: A Robustness Perspective,\\nComputer 56 (2) (2023) 48–56.\\n[31] K. R. Varshney, Trustworthy machine learning and artificial intelli-\\ngence, XRDS: Crossroads, The ACM Magazine for Students 25 (3)\\n(2019) 26–29.\\n[32] J. Yang, K. Zhou, Y. Li, Z. Liu, Generalized out-of-distribution\\ndetection: A survey, arXiv preprint arXiv:2110.11334 (2021).\\n[33] A. Ruospo, E. Sanchez, L. M. Luza, L. Dilillo, M. Traiola, A. Bosio,\\nA survey on deep learning resilience assessment methodologies,\\nComputer 56 (2) (2023) 57–66.\\n[34] S. Speakman, G. A. Tadesse, C. Cintas, W. Ogallo, T. Akumu,\\nA. Oshingbesan, Detecting systematic deviations in data and models,\\nComputer 56 (2) (2023) 82–92.\\n[35] T. Lesort, V. Lomonaco, A. Stoian, D. Maltoni, D. Filliat, N. Díaz-\\nRodríguez, Continual learning for robotics: Definition, framework,\\nlearning strategies, opportunities and challenges, Information fusion\\n58 (2020) 52–68.\\n[36] M. Abdar, F. Pourpanah, S. Hussain, D. Rezazadegan, L. Liu,\\nM. Ghavamzadeh, P. Fieguth, X. Cao, A. Khosravi, U. R. Acharya,\\net al., A review of uncertainty quantification in deep learning: Tech-\\nniques, applications and challenges, Information Fusion 76 (2021)\\n243–297.\\n[37] J. Parmar, S. Chouhan, V. Raychoudhury, S. Rathore, Open-world\\nmachine learning: applications, challenges, and opportunities, ACM\\nComputing Surveys 55 (10) (2023) 1–37.\\n[38] R. S. Zimmermann, W. Brendel, F. Tramer, N. 
Carlini, Increasing confidence in adversarial robustness evaluations, in: A. H. Oh, A. Agarwal, D. Belgrave, K. Cho (Eds.), Advances in Neural Information Processing Systems, 2022. URL https://openreview.net/forum?id=NkK4i91VWp\n[39] D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, D. Mané, Concrete problems in AI safety, arXiv preprint arXiv:1606.06565 (2016).\n[40] D. Hendrycks, N. Carlini, J. Schulman, J. Steinhardt, Unsolved problems in ml safety, arXiv preprint arXiv:2109.13916 (2021).\n[41] S. Mohseni, H. Wang, C. Xiao, Z. Yu, Z. Wang, J. Yadawa, Taxonomy of machine learning safety: A survey and primer, ACM Computing Surveys 55 (8) (2022) 1–38.\n[42] T. Gu, K. Liu, B. Dolan-Gavitt, S. Garg, Badnets: Evaluating backdooring attacks on deep neural networks, IEEE Access 7 (2019) 47230–47244.\n[43] D. Hendrycks, C. Burns, S. Basart, A. Critch, J. Li, D. Song, J. Steinhardt, Aligning AI with shared human values, Proceedings of the International Conference on Learning Representations (ICLR) (2021).\n[44] C. O’neil, Weapons of math destruction: How big data increases inequality and threatens democracy, Crown, 2017.\n[45] R. B. Parikh, S. Teeple, A. S. Navathe, Addressing bias in artificial intelligence in health care, Jama 322 (24) (2019) 2377–2378.\n[46] K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman, V. Ivanov, C. Kiddon, J. Konečný, S. Mazzocchi, B. McMahan, et al., Towards federated learning at scale: System design, Proceedings of Machine Learning and Systems 1 (2019) 374–388.\n[47] N. Rodríguez-Barroso, G. Stipcich, D. Jiménez-López, J. A. Ruiz-Millán, E. Martínez-Cámara, G. González-Seco, M. V. Luzón, M. A. Veganzones, F. Herrera, Federated learning and differential privacy: Software tools analysis, the Sherpa.ai FL framework and methodological guidelines for preserving data privacy, Information Fusion 64 (2020) 270–292.\n[48] C. Marcolla, V. Sucasas, M. Manzano, R. Bassoli, F. H. Fitzek, N. Aaraj, Survey on fully homomorphic encryption, theory, and applications, Proceedings of the IEEE 110 (10) (2022) 1572–1609.\n[49] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, L. Zhang, Deep learning with differential privacy, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 308–318.\n[50] Public Voice coalition, Universal Guidelines for Artificial Intelligence, https://thepublicvoice.org/ai-universal-guidelines/, online [accessed April 20th, 2023] (2018).\n[51] Information Commissioner’s Office (ICO), How to use AI and personal data appropriately and lawfully, https://ico.org.uk/media/for-organisations/documents/4022261/how-to-use-ai-and-personal-data.pdf, online [accessed April 20th, 2023] (2022).\n[52] E. Union, Regulation (EU) 2022/868 of the European Parliament and of the Council of 30 May 2022 on European data governance and amending Regulation (EU) 2018/1724 (Data Governance Act) (2022).\n[53] E. Union, Proposal for a REGULATION OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL on harmonised rules on fair access to and use of data (Data Act) (2022).\n[54] A. Barredo Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. 
Ben-\\njamins, et al., Explainable Artificial Intelligence (XAI): Concepts,\\ntaxonomies, opportunities and challenges toward responsible AI,\\nInformation Fusion 58 (2020) 82–115.\\n[55] K. Haresamudram, S. Larsson, F. Heintz, Three levels of AI trans-\\nparency, Computer 56 (2) (2023) 93–100.\\n[56] B. Pérez, J. Rubio, C. Sáenz-Adán, A systematic review of prove-\\nnance systems, Knowledge and Information Systems 57 (2018) 495–\\n543.\\n[57] A. Holzinger, M. Dehmer, F. Emmert-Streib, R. Cucchiara, I. Au-\\ngenstein, J. Del Ser, W. Samek, I. Jurisica, N. Díaz-Rodríguez,\\nInformation fusion as an integrative cross-cutting enabler to achieve\\nrobust, explainable, and trustworthy medical artificial intelligence,\\nInformation Fusion 79 (2022) 263–278.\\n[58] S. Ali, T. Abuhmed, S. El-Sappagh, K. Muhammad, J. M. Alonso-\\nMoral, R. Confalonieri, R. Guidotti, J. Del Ser, N. Díaz-Rodríguez,\\nF. Herrera, Explainable Artificial Intelligence (XAI): What we know\\nand what is left to attain Trustworthy Artificial Intelligence, Infor-\\nmation Fusion (2023) 101805.\\n[59] M. T. Ribeiro, S. Singh, C. Guestrin, \\\"Why should I trust you?\\\"\\nExplaining the predictions of any classifier, in: Proceedings of\\nthe 22nd ACM SIGKDD International Conference on Knowledge\\nDiscovery and Data Mining, 2016, pp. 1135–1144.\\n[60] N. F. Rajani, B. McCann, C. Xiong, R. Socher, Explain yourself!\\nleveraging language models for commonsense reasoning, in: Pro-\\nceedings of the 57th Annual Meeting of the Association for Com-\\nputational Linguistics, Association for Computational Linguistics,\\nFlorence, Italy, 2019, pp. 4932–4942. doi:10.18653/v1/P19-1487.\\nURL https://aclanthology.org/P19-1487\\n[61] K. Abhishek, D. Kamath, Attribution-based xai methods in com-\\nputer vision: A review, arXiv preprint arXiv:2211.14736 (2022).\\n[62] R. Guidotti, A. Monreale, F. Giannotti, D. Pedreschi, S. Ruggieri,\\nF. Turini, Factual and counterfactual explanations for black box\\ndecision making, IEEE Intelligent Systems 34 (6) (2019) 14–23.\\n[63] J. van der Waa, E. Nieuwburg, A. Cremers, M. Neerincx, Evaluating\\nXAI: A comparison of rule-based and example-based explanations,\\nArtificial Intelligence 291 (2021) 103404.\\n[64] K. Kaczmarek-Majer, G. Casalino, G. Castellano, M. Dominiak,\\nO. Hryniewicz, O. Kamińska, G. Vessio, N. Díaz-Rodríguez, Ple-\\nnary: Explaining black-box models in natural language through\\nfuzzy linguistic summaries, Information Sciences 614 (2022) 374–\\n399.\\n[65] V. Bourgeais, F. Zehraoui, B. Hanczar, GraphGONet: a self-\\nexplaining neural network encapsulating the Gene Ontology graph\\nfor phenotype prediction on gene expression, Bioinformatics 38 (9)\\n(2022) 2504–2511.\\n[66] N. Díaz-Rodríguez, A. Lamas, J. Sanchez, G. Franchi, I. Donadello,\\nS. Tabik, D. Filliat, P. Cruz, R. Montes, F. Herrera, EXplainable\\nNeural-Symbolic Learning (X-NeSyL) methodology to fuse deep\\nlearning representations with expert knowledge graphs: The Mon-\\nuMAI cultural heritage use case, Information Fusion 79 (2022) 58–\\n83.\\n[67] L. Salewski, A. Koepke, H. Lensch, Z. Akata, CLEVR-X: A Visual\\nReasoning Dataset for Natural Language Explanations, in: Interna-\\ntional Workshop on Extending Explainable AI Beyond Deep Models\\nand Classifiers, Springer, 2022, pp. 69–88.\\n[68] G. Vilone, L. Longo, Notions of explainability and evaluation ap-\\nproaches for explainable artificial intelligence, Information Fusion\\n76 (2021) 89–106.\\n[69] I. Sevillano-Garcia, J. Luengo, F. 
Herrera, REVEL framework to\\nmeasure local linear explanations for black-box models: Deep learn-\\ning image classification case study, International Journal of Intelli-\\ngent Systems 2023 (2023) 8068569.\\n[70] I. Hupont, C. Fernández, Demogpairs: Quantifying the impact of\\ndemographic imbalance in deep face recognition, in: 14th IEEE\\nInternational Conference on Automatic Face & Gesture Recognition\\n(FG 2019), IEEE, 2019, pp. 1–7.\\n[71] M.-P. Fernando, F. Cèsar, N. David, H.-O. José, Missing the missing\\nvalues: The ugly duckling of fairness in machine learning, Interna-\\ntional Journal of Intelligent Systems 36 (7) (2021) 3217–3258.\\n[72] A. H. Gee, D. Garcia-Olano, J. Ghosh, D. Paydarfar, Explaining deep\\nclassification of time-series data with learned prototypes, in: CEUR\\nworkshop proceedings, Vol. 2429, NIH Public Access, 2019, p. 15.\\n[73] A. Cully, Y. Demiris, Quality and diversity optimization: A unifying\\nmodular framework, IEEE Transactions on Evolutionary Computa-\\ntion 22 (2) (2017) 245–259.\\n[74] S. Hajian, F. Bonchi, C. Castillo, Algorithmic bias: From discrim-\\nination discovery to fairness-aware data mining, in: Proceedings of\\nthe 22nd ACM SIGKDD International Conference on Knowledge\\nDiscovery and Data Mining, 2016, pp. 2125–2126.\\n[75] D. Pedreshi, S. Ruggieri, F. Turini, Discrimination-aware data min-\\ning, in: Proceedings of the 14th ACM SIGKDD International Con-\\nference on Knowledge Discovery and Data Mining, 2008, pp. 560–\\n568.\\n[76] N. Díaz-Rodríguez, G. Pisoni, Accessible cultural heritage through\\nexplainable artificial intelligence, in: Adjunct Publication of the 28th\\nACM Conference on User Modeling, Adaptation and Personaliza-\\ntion, 2020, pp. 317–324.\\n[77] B. Shneiderman, Human-centered AI, Oxford University Press,\\n2022.\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 28 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\n[78] N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, A. Galstyan, A\\nsurvey on bias and fairness in machine learning, ACM Computing\\nSurveys (CSUR) 54 (6) (2021) 1–35.\\n[79] X. Gu, Z. Tianqing, J. Li, T. Zhang, W. Ren, K.-K. R. Choo, Pri-\\nvacy, accuracy, and model fairness trade-offs in federated learning,\\nComputers & Security 122 (2022) 102907.\\n[80] M. Du, R. Tang, W. Fu, X. Hu, Towards debiasing DNN models from\\nspurious feature influence, in: Proceedings of the AAAI Conference\\non Artificial Intelligence, Vol. 36, 2022, pp. 9521–9528.\\n[81] B. H. Zhang, B. Lemoine, M. Mitchell, Mitigating unwanted biases\\nwith adversarial learning, in: Proceedings of the 2018 AAAI/ACM\\nConference on AI, Ethics, and Society, 2018, pp. 335–340.\\n[82] U. Aïvodji, H. Arai, O. Fortineau, S. Gambs, S. Hara, A. Tapp,\\nFairwashing: the risk of rationalization, in: International Conference\\non Machine Learning, PMLR, 2019, pp. 161–170.\\n[83] U. Aïvodji, H. Arai, S. Gambs, S. Hara, Characterizing the risk of\\nfairwashing, Advances in Neural Information Processing Systems 34\\n(2021) 14822–14834.\\n[84] R. Baeza-Yates, Bias on the web, Communications of the ACM\\n61 (6) (2018) 54–61.\\n[85] A. Balayn, C. Lofi, G.-J. Houben, Managing bias and unfairness in\\ndata for decision support: a survey of machine learning and data\\nengineering approaches to identify and mitigate bias and unfairness\\nwithin data management and analytics systems, The VLDB Journal\\n30 (5) (2021) 739–768.\\n[86] J. Silberg, J. 
Manyika, Notes from the AI frontier: Tackling bias in\\nAI (and in humans), McKinsey Global Institute 1 (6) (2019).\\n[87] G. Smith, I. Rustagi, Mitigating Bias in Artificial Intelligence,\\nAn Equity Fluent Leadership Playbook, Berkeley Haas Center for\\nEquity, Gender and Leadership (2020).\\nURL https://haas.berkeley.edu/wp-content/uploads/UCB_Playbook_\\nR10_V2_spreads2.pdf\\n[88] A. Gulati, M. A. Lozano, B. Lepri, N. Oliver, BIASeD: Bringing\\nIrrationality into Automated System Design, in: Proceedings of the\\nThinking Fast and Slow and Other Cognitive Theories in AI (in\\nAAAI 2022 Fall Symposium), Vol. 3332, 2022.\\n[89] H. Suresh, J. Guttag, A framework for understanding sources of harm\\nthroughout the machine learning life cycle, in: Equity and access in\\nalgorithms, mechanisms, and optimization, 2021, pp. 1–9.\\n[90] S. Barocas, M. Hardt, A. Narayanan, Fairness and Machine Learn-\\ning: Limitations and Opportunities, fairmlbook.org, 2019, http:\\n//www.fairmlbook.org.\\n[91] J. Pearl, D. Mackenzie, The Book of Why, Basic Books, 2018.\\n[92] N. Díaz-Rodríguez, R. Binkyt˙e, W. Bakkali, S. Bookseller,\\nP. Tubaro, A. Bacevičius, S. Zhioua, R. Chatila, Gender and sex bias\\nin COVID-19 epidemiological data through the lenses of causality,\\nInformation Processing & Management 60 (3) (2023) 103276. doi:\\nhttps://doi.org/10.1016/j.ipm.2023.103276.\\nURL https://www.sciencedirect.com/science/article/pii/S0306457\\n323000134\\n[93] D. Rolnick, P. L. Donti, L. H. Kaack, K. Kochanski, A. La-\\ncoste, K. Sankaran, A. S. Ross, N. Milojevic-Dupont, N. Jaques,\\nA. Waldman-Brown, et al., Tackling climate change with machine\\nlearning, ACM Computing Surveys (CSUR) 55 (2) (2022) 1–96.\\n[94] S. Salcedo-Sanz, J. Pérez-Aracil, G. Ascenso, J. Del Ser, D. Casillas-\\nPérez, C. Kadow, D. Fister, D. Barriopedro, R. García-Herrera,\\nM. Restelli, et al., Analysis, characterization, prediction and attribu-\\ntion of extreme atmospheric events with machine learning: a review,\\narXiv preprint arXiv:2207.07580 (2022).\\n[95] J. Cowls, A. Tsamados, M. Taddeo, L. Floridi, The AI gambit:\\nleveraging artificial intelligence to combat climate change – opportu-\\nnities, challenges, and recommendations, AI & Society (2021) 1–25.\\n[96] K. Hao, Training a single AI model can emit as much carbon as five\\ncars in their lifetimes, MIT technology Review 75 (2019) 103.\\n[97] E. Strubell, A. Ganesh, A. McCallum, Energy and policy consid-\\nerations for deep learning in NLP, in: Proceedings of the 57th\\nAnnual Meeting of the Association for Computational Linguistics,\\nAssociation for Computational Linguistics, Florence, Italy, 2019, pp.\\n3645–3650. doi:10.18653/v1/P19-1355.\\nURL https://aclanthology.org/P19-1355\\n[98] D. Patterson, J. Gonzalez, U. Hölzle, Q. Le, C. Liang, L.-M.\\nMunguia, D. Rothchild, D. R. So, M. Texier, J. Dean, The carbon\\nfootprint of machine learning training will plateau, then shrink,\\nComputer 55 (7) (2022) 18–28.\\n[99] C.-J. Wu, R. Raghavendra, U. Gupta, B. Acun, N. Ardalani,\\nK. Maeng, G. Chang, F. Aga, J. Huang, C. Bai, et al., Sustainable\\nAI: Environmental implications, challenges and opportunities, Pro-\\nceedings of Machine Learning and Systems 4 (2022) 795–813.\\n[100] R. Schwartz, J. Dodge, N. A. Smith, O. Etzioni, Green AI, Commu-\\nnications of the ACM 63 (12) (2020) 54–63.\\n[101] A. Lacoste, A. Luccioni, V. Schmidt, T. Dandres, Quantify-\\ning the carbon emissions of machine learning, arXiv preprint\\narXiv:1910.09700 (2019).\\n[102] J. Maillo, I. Triguero, F. 
Herrera, Redundancy and complexity\\nmetrics for big data classification: Towards smart data, IEEE Access\\n8 (2020) 87918–87928.\\n[103] G. C. Marinó, A. Petrini, D. Malchiodi, M. Frasca, Deep neural\\nnetworks compression: A comparative survey and choice recom-\\nmendations, Neurocomputing 520 (2023) 152–170.\\n[104] R. Mishra, H. P. Gupta, T. Dutta, A survey on deep neural network\\ncompression: Challenges, overview, and solutions, arXiv preprint\\narXiv:2010.03954 (2020).\\n[105] D. Becking, M. Dreyer, W. Samek, K. Müller, S. Lapuschkin, ECQ:\\nExplainability-Driven Quantization for Low-Bit and Sparse DNNs,\\nin: International Workshop on Extending Explainable AI Beyond\\nDeep Models and Classifiers, Springer, 2022, pp. 271–296.\\n[106] G. Hinton, O. Vinyals, J. Dean, Distilling the knowledge in a neural\\nnetwork, arXiv preprint arXiv:1503.02531 (2015).\\n[107] R. Traoré, H. Caselles-Dupré, T. Lesort, T. Sun, N. Díaz-Rodríguez,\\nD. Filliat, Continual reinforcement learning deployed in real-life\\nusing policy distillation and Sim2Real transfer, in: ICML Workshop\\non Multi-Task and Lifelong Reinforcement Learning, 2019.\\n[108] Y. Cheng, D. Wang, P. Zhou, T. Zhang, A survey of model com-\\npression and acceleration for deep neural networks, arXiv preprint\\narXiv:1710.09282 (2017).\\n[109] R. V. Zicari, J. Amann, F. Bruneault, M. Coffee, B. Düdder, E. Hick-\\nman, A. Gallucci, T. K. Gilbert, T. Hagendorff, I. van Halem,\\net al., How to assess trustworthy AI in practice, arXiv preprint\\narXiv:2206.09887 (2022).\\n[110] ISO/IEC, ISO/IEC TR 24029-1, Information technology — Arti-\\nficial Intelligence (AI) – Assessment of the robustness of neural\\nnetworks - Part 1: Overview, https://www.iso.org/standard/776\\n09.html (2021).\\n[111] D. V. Carvalho, E. M. Pereira, J. S. Cardoso, Machine learning\\ninterpretability: A survey on methods and metrics, Electronics 8 (8)\\n(2019) 832.\\n[112] J. H.-w. Hsiao, H. H. T. Ngai, L. Qiu, Y. Yang, C. C. Cao, Roadmap\\nof designing cognitive metrics for explainable artificial intelligence\\n(XAI), arXiv preprint arXiv:2108.01737 (2021).\\n[113] A. Rosenfeld, Better metrics for evaluating explainable artificial\\nintelligence, in: Proceedings of the 20th International Conference\\non Autonomous Agents and MultiAgent Systems, 2021, pp. 45–50.\\n[114] R. R. Hoffman, S. T. Mueller, G. Klein, J. Litman, Metrics\\nfor explainable AI: Challenges and prospects, arXiv preprint\\narXiv:1812.04608 (2018).\\n[115] F. Sovrano, S. Sapienza, M. Palmirani, F. Vitali, A survey on\\nmethods and metrics for the assessment of explainability under the\\nproposed AI Act, in: The Thirty-fourth Annual Conference on Legal\\nKnowledge and Information Systems (JURIX), Vol. 346, IOS Press,\\n2022, p. 235.\\n[116] A.-H. Karimi, J. von Kügelgen, B. Schölkopf, I. Valera, Towards\\ncausal algorithmic recourse, in: International Workshop on Extend-\\ning Explainable AI Beyond Deep Models and Classifiers, Springer,\\n2022, pp. 139–166.\\n[117] C. Novelli, M. Taddeo, L. Floridi, Accountability in artificial intel-\\nligence: what it is and how it works, AI & Society (2023) 1–12.\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 29 of 30\\nConnecting the Dots in Trustworthy Artificial Intelligence\\n[118] A. Institute, Algorithmic Accountability Policy Toolkit (2018).\\nURL https://ainowinstitute.org/aap-toolkit.pdf\\n[119] B. Kim, F. Doshi-Velez, Machine learning techniques for account-\\nability, AI Magazine 42 (1) (2021) 47–52.\\n[120] B. Xia, Q. Lu, H. 
Perera, L. Zhu, Z. Xing, Y. Liu, J. Whittle, Towards\\nconcrete and connected AI risk assessment (C2AIRA): A systematic\\nmapping study (2023). arXiv:2301.11616.\\n[121] J. A. Baquero, R. Burkhardt, A. Govindarajan, T. Wallace, Derisking\\nAI by design: How to build risk management into AI development,\\nMcKinsey & Company (2020).\\n[122] J. Mökander, M. Axente, F. Casolari, L. Floridi, Conformity assess-\\nments and post-market monitoring: A guide to the role of auditing\\nin the proposed european AI regulation, Minds and Machines 32 (2)\\n(2022) 241–268.\\n[123] A. Holzinger, A. Carrington, H. Müller, Measuring the quality of\\nexplanations: the system causability scale (SCS) comparing human\\nand machine explanations, KI-Künstliche Intelligenz 34 (2) (2020)\\n193–198.\\n[124] Z. Han, H. Yanco, Communicating missing causal information to\\nexplain a robot’s past behavior, ACM Transactions on Human-Robot\\nInteraction 12 (1) (2023) 1–45.\\n[125] M. Brundage, S. Avin, J. Wang, H. Belfield, G. Krueger, G. Hadfield,\\nH. Khlaaf, J. Yang, H. Toner, R. Fong, et al., Toward trustworthy\\nAI development: mechanisms for supporting verifiable claims, arXiv\\npreprint arXiv:2004.07213 (2020).\\n[126] L. Bourtoule, V. Chandrasekaran, C. A. Choquette-Choo, H. Jia,\\nA. Travers, B. Zhang, D. Lie, N. Papernot, Machine unlearning, in:\\nIEEE Symposium on Security and Privacy (SP), IEEE, 2021, pp.\\n141–159.\\n[127] S. Verma, V. Boonsanong, M. Hoang, K. E. Hines, J. P. Dickerson,\\nC. Shah, Counterfactual explanations and algorithmic recourses\\nfor machine learning: A review, in: NeurIPS 2020 Workshop: ML\\nRetrospectives, Surveys & Meta-Analyses (ML-RSA), 2020.\\n[128] S. Barocas, A. D. Selbst, Big data’s disparate impact, California Law\\nReview (2016) 671–732.\\n[129] L. Floridi, M. Holweg, M. Taddeo, J. Amaya Silva, J. Mökander,\\nY. Wen, CapAI-A procedure for conducting conformity assessment\\nof AI systems in line with the EU artificial intelligence act, Available\\nat SSRN 4064091 (2022).\\n[130] R. Parenti, Regulatory sandboxes and innovation hubs for fintech,\\nStudy for the Committee on Economic and Monetary Affairs, Policy\\nDepartment for Economic, Scientific and Quality of Life Policies,\\nEuropean Parliament, Luxembourg (2020) 65.\\n[131] F. Pop, L. Adomavicius, Sandboxes for responsible artificial intelli-\\ngence. eipa briefing september 2021. (2021).\\n[132] K. Yordanova, The EU AI Act-Balancing human rights and innova-\\ntion through regulatory sandboxes and standardization (2022).\\n[133] J. Soler Garrido, S. Tolan, I. Hupon Torres, D. Fernandez Llorca,\\nV. Charisi, E. Gomez Gutierrez, H. Junklewitz, R. Hamon, D. Fano\\nYela, C. Panigutti, AI Watch: Artificial intelligence standardisation\\nlandscape update, Tech. rep., Joint Research Centre (Seville site)\\n(2023).\\n[134] T. Madiega, A. L. Van De Pol, Artificial intelligence act and regu-\\nlatory sandboxes. EPRS European Parliamentary Research Service.\\nJune 2022 (2022).\\nURL https://www.europarl.europa.eu/RegData/etudes/BRIE/2022/73\\n3544/EPRS_BRI(2022)733544_EN.pdf\\n[135] Coalition for Health AI (CHAI), Blueprint for trustworthy AI imple-\\nmentation guidance and assurance for healthcare (2023).\\nURL https://www.coalitionforhealthai.org/papers/Blueprint%20fo\\nr%20Trustworthy%20AI.pdf\\n[136] J. Zhang, Z.-M. Zhang, Ethics and governance of trustworthy med-\\nical artificial intelligence, BMC Medical Informatics and Decision\\nMaking 23 (1) (2023) 1–15.\\n[137] J. Amann, D. Vetter, S. N. Blomberg, H. C. Christensen, M. 
Coffee,\\nS. Gerke, T. K. Gilbert, T. Hagendorff, S. Holm, M. Livne, et al., To\\nexplain or not to explain?– Artificial intelligence explainability in\\nclinical decision support systems, PLOS Digital Health 1 (2) (2022)\\ne0000016.\\n[138] R. V. Zicari, J. Brodersen, J. Brusseau, B. Düdder, T. Eichhorn,\\nT. Ivanov, G. Kararigas, P. Kringen, M. McCullough, F. Möslein,\\net al., Z-inspection®: a process to assess trustworthy AI, IEEE\\nTransactions on Technology and Society 2 (2) (2021) 83–97.\\n[139] H. Muller, M. T. Mayrhofer, E.-B. Van Veen, A. Holzinger, The\\nten commandments of ethical medical AI, Computer 54 (07) (2021)\\n119–123.\\n[140] K. Stöger, D. Schneeberger, A. Holzinger, Medical artificial intelli-\\ngence: the european legal perspective, Communications of the ACM\\n64 (11) (2021) 34–36.\\n[141] J. Baker-Brunnbauer, TAII Framework for Trustworthy AI systems,\\nROBONOMICS: The Journal of the Automated Economy 2 (2021)\\n17.\\n[142] Editorials, Writing the rules in ai-assisted writing, Nature Machine\\nIntelligence 469 (5) (2023) 469–469. doi:https://doi.org/10.1038/\\ns42256-023-00678-6.\\n[143] C. T. Zirpoli, Generative artificial intelligence and copyright law,\\nUnited States Congressional Research Service, CRS Legal Sidebar,\\n(February 23, 10922 (5 pages) (2023).\\n[144] J. Tang, A. LeBel, S. Jain, A. G. Huth, Semantic reconstruction\\nof continuous language from non-invasive brain recordings, Nature\\nNeuroscience (2023) 1–9.\\n[145] S. Reardon, Mind-reading machines are here: is it time to worry?,\\nNature 617 (7960) (2023) 236–236.\\n[146] R. Burnell, W. Schellaert, J. Burden, T. D. Ullman, F. Martinez-\\nPlumed, J. B. Tenenbaum, D. Rutar, L. G. Cheke, J. Sohl-Dickstein,\\nM. Mitchell, et al., Rethink reporting of evaluation results in ai,\\nScience 380 (6641) (2023) 136–138.\\n[147] P. Liang, R. Bommasani, T. Lee, D. Tsipras, D. Soylu, M. Yasunaga,\\nY. Zhang, D. Narayanan, Y. Wu, A. Kumar, et al., Holistic evaluation\\nof language models, arXiv preprint arXiv:2211.09110 (2022).\\n[148] T. Hernandez-Boussard, S. Bozkurt, J. P. Ioannidis, N. H. Shah,\\nMINIMAR (MINimum Information for Medical AI Reporting): de-\\nveloping reporting standards for artificial intelligence in health care,\\nJournal of the American Medical Informatics Association 27 (12)\\n(2020) 2011–2015.\\n[149] J. Schuett, A. Reuel, A. Carlier, How to design an AI ethics board,\\narXiv preprint arXiv:2304.07249 (2023).\\nN. Díaz-Rodríguez, J. Del Ser et al.: Preprint submitted to Elsevier\\nPage 30 of 30\\n\\n\\n\\n---\\n\\n\\n Three lines of defense against risks from AI \\nJonas Schuett* \\n \\nOrganizations that develop and deploy artificial intelligence (AI) systems need to \\nmanage the associated risks—for economic, legal, and ethical reasons. However, it is \\nnot always clear who is responsible for AI risk management. The Three Lines of De-\\nfense (3LoD) model, which is considered best practice in many industries, might offer \\na solution. It is a risk management framework that helps organizations to assign and \\ncoordinate risk management roles and responsibilities. In this article, I suggest ways \\nin which AI companies could implement the model. I also discuss how the model \\ncould help reduce risks from AI: it could identify and close gaps in risk coverage, \\nincrease the effectiveness of risk management practices, and enable the board of di-\\nrectors to oversee management more effectively. 
The article is intended to inform \\ndecision-makers at leading AI companies, regulators, and standard-setting bodies. \\n1 Introduction \\nOrganizations that develop and deploy artificial intelligence (AI) systems need \\nto manage the associated risks—for economic reasons, because accidents and \\ncases of misuse can threaten business performance (Cheatham, Javanmardian, \\n& Hamid Samandari, 2019), for legal reasons, because upcoming AI regulation \\nmight require them to implement a risk management system (Schuett, 2022), \\nand for ethical reasons, because AI could have large and long-lasting impacts \\non society (Clarke & Whitlestone, 2022). \\n \\nHowever, it is not always clear who is responsible for AI risk management: \\nThe researchers and engineers? The legal and compliance department? The \\ngovernance team? The Three Lines of Defense (3LoD) model might offer a \\nsolution. It is a risk management framework intended to improve an organiza-\\ntion’s risk governance by assigning and coordinating risk management roles \\nand responsibilities (Institute of Internal Auditors [IIA], 2013, 2020a). It is \\nconsidered best practice in many industries, such as finance and aviation. In \\nthis article, I apply the 3LoD model to an AI context. \\n \\nTo date, there has not been much academic work on the intersection of AI \\nand the 3LoD model. Nunn (2020) suggests using the model to reduce \\n__________ \\n* Research Fellow, Centre for the Governance of AI, Oxford, UK; Research Affiliate, \\nLegal Priorities Project, Cambridge, MA, USA; PhD Candidate, Faculty of Law, Goethe \\nUniversity Frankfurt, Germany; jonas.schuett@governance.ai. \\nThree lines of defense against risks from AI \\n2 \\ndiscrimination risks from AI, but the relevant passage is very short. There is \\nalso some literature on how companies could use AI to support the three lines \\n(Tammenga, 2020; Sekar, 2022), but I am mainly interested in how to govern \\nAI companies, not how to use AI to govern non-AI companies. It has also been \\nproposed that governments could use the 3LoD model to manage extreme risks \\nfrom AI (Ord, 2021), but here I focus on the challenges of companies, not gov-\\nernment. \\n \\nWhile academic scholarship on this topic may be limited, there is some rel-\\nevant work from practitioners. Most notably, there is a blog post by PwC that \\nseeks to answer questions similar to this article (Rao & Golbin, 2021). But \\nsince they only dedicate a short section to the 3LoD model, their proposal only \\nscratches the surface. The IIA has also published a three-part series, in which \\nthey propose an AI auditing framework (IIA, 2017a, 2017c, 2018). Although \\ntheir proposal contains a reference to the 3LoD model, it does not play a key \\nrole. Finally, the 3LoD model is mentioned in a playbook that the National \\nInstitute of Standards and Technology (NIST) published alongside the second \\ndraft of its AI Risk Management Framework (NIST, 2022a). However, the \\nplaybook only suggests implementing the 3LoD model (or a related mecha-\\nnism), it does not specify how to do so. \\n \\nTaken together, there are at least two gaps in the current literature. The first \\none is practical: there does not seem to be a concrete proposal for how organi-\\nzations that develop and deploy AI systems could implement the 3LoD model. \\nThe few proposals that exist are not detailed enough to provide meaningful \\nguidance. 
The second one is normative: there does not seem to be a thorough \\ndiscussion about whether implementing the model is even desirable. Given that \\nthe model has been criticized and there is not much empirical evidence for its \\neffectiveness, the answer to this question is not obvious. In light of this, the \\narticle seeks to answer two research questions: (1) How could organizations \\nthat develop and deploy AI systems implement the 3LoD model? (2) To what \\nextent would implementing the 3LoD model help reduce risks from AI? \\n \\nThe article has three areas of focus. First, it focuses on organizations that \\ndevelop and deploy state-of-the-art AI systems, in particular medium-sized re-\\nsearch labs (e.g. DeepMind and OpenAI) and big tech companies (e.g. Google \\nand Microsoft), though the boundaries between the two categories are blurry \\n(e.g. DeepMind is a subsidiary of Alphabet and OpenAI has a strategic part-\\nnership with Microsoft). In the following, I use the term “AI companies” to \\nrefer to all of them. I do not cover other types of companies (e.g. hardware \\ncompanies) or academic institutions, but they might also benefit from my anal-\\nysis. Second, the article focuses on the organizational dimension of AI risk \\nmanagement. It is not about how AI companies should identify, assess, and \\nrespond to risks from AI. Instead, it is about how they should assign and coor-\\ndinate risk management roles and responsibilities. Third, the article focuses on \\nthe model’s ability to prevent individual, collective, or societal harm (Smuha, \\n2021). I am less interested in risks to companies themselves (e.g. litigation or \\nJonas Schuett \\n3 \\nreputation risks), though occasionally private and public interests are aligned \\n(e.g. one way to reduce litigation risks is to prevent accidents). \\n \\nThe remainder of this article proceeds as follows. Section 2 gives an over-\\nview of the model’s basic structure, history, criticisms, and evidence base. Sec-\\ntion 3 suggests ways in which AI companies could implement the model. Sec-\\ntion 4 discusses how the model could help reduce risks from AI. Section 5 con-\\ncludes and suggests questions for further research.\\n2 The 3LoD model \\nIn this section, I give an overview of the basic structure (Section 2.1) and his-\\ntory of the 3LoD model (Section 2.2). I also engage with some of the main \\ncriticisms, briefly discuss alternative models (Section 2.3), and review the em-\\npirical evidence for its effectiveness (Section 2.4). \\n2.1 Basic structure \\nThere are different versions of the 3LoD model. Most practitioners and schol-\\nars are familiar with the version published by the IIA (2013). After a review \\nprocess, they published an updated version (IIA, 2020a), which increasingly \\nreplaces the original version. This article will mainly use the updated version, \\nas illustrated in Figure 1. The updated model has three types of elements: ac-\\ntors, roles, and relationships. \\n \\n \\n \\nFigure 1: The 3LoD model as described by the IIA (2020a) \\n \\nThe model distinguishes between four actors, represented as blue boxes: the \\ngoverning body, which is accountable to stakeholders for organizational over-\\nsight; management, which takes actions to achieve the organization’s objec-\\ntives; internal audit, which provides independent assurance to the governing \\nbody, as do external assurance providers. 
\\nThree lines of defense against risks from AI \\n2 \\n \\nThe model further distinguishes between four roles, represented as gray \\nboxes. The role of the governing body is to demonstrate integrity, leadership, \\nand transparency. In addition to that, the model contains three roles which it \\ncalls “lines of defense”. The first line provides products and services to clients, \\nand manages the associated risks. The second line assists the first line with \\nregards to risk management. It provides complementary expertise and support, \\nbut also monitors and challenges risk management practices. The third line \\nprovides independent and objective assurance and advice on all matters related \\nto the achievement of risk objectives. The first two lines are part of manage-\\nment, while the third line is synonymous with internal audit. \\n \\nFinally, there are three types of relationships between different actors, rep-\\nresented as arrows. There are top-down relationships: the governing body del-\\negates responsibility to management and oversees internal audit. Inversely, \\nthere are bottom-up relationships: management and internal audit are account-\\nable and report to the governing body. And lastly, there is a horizontal rela-\\ntionship between actors whose work must be aligned, namely between man-\\nagement and internal audit. \\n2.2 Brief history \\nThe model’s origins are opaque. There are theories suggesting military, sport-\\ning, or quality control origins (Davies & Zhivitskaya, 2018). It was presumably \\ndeveloped in the late 1990s or early 2000s. In 1999, the Basel Committee on \\nBanking Supervision (BCBS) suggested a similar approach to risk oversight \\n(BCBS, 1999), but the first explicit mention of the model was probably in a \\nreport by the UK Financial Services Authority (2003) or a paper by Roman \\nKräussl (2003). \\n \\nAfter the financial crisis of 2007-2008, which was partly caused by wide-\\nspread risk management failures (Boatright, 2016), the model’s popularity sky-\\nrocketed. In response to the crisis, regulators and supervisory authorities paid \\nincreasing attention to the chief risk officer (CRO) and the risk committee of \\nthe board (Walker, 2009; Davies & Zhivitskaya, 2018), and started recom-\\nmending the 3LoD model (BCBS, 2012; European Banking Authority, 2021). \\nMost academic work on the model was also done after the crisis (e.g. Davies \\n& Zhivitskaya, 2018; Bantleon et al., 2021) and many risk management pro-\\nfessionals only heard about the model in its aftermath (Zhivitskaya, 2015). \\n \\nToday, most listed companies have implemented the 3LoD model. In a 2015 \\nsurvey of internal audit professionals in 166 countries (n=14,518), the majority \\nof respondents (75%) reported that their organization follows the 3LoD model \\nas articulated by the IIA (Huibers, 2015).1 Another survey, conducted in 2021 \\namong chief audit executives (CAEs) in Austria, Germany, and Switzerland \\n(n=415), supports their findings (Bantleon et al., 2021). The majority of \\n__________ \\n1 Note that respondents who said they were not familiar with the model were excluded. \\nJonas Schuett \\n3 \\nrespondents (88%) reported that they had implemented the model, with partic-\\nularly high adoption rates among financial institutions (96%). \\n \\nIn contrast, big tech companies do not seem to have implemented the 3LoD \\nmodel. It is not mentioned in any of their filings to the US Securities and Ex-\\nchange Commission (SEC) or other publications. 
The model is also not explic-\\nitly mentioned in the corporate governance requirements by Nasdaq (2022), \\nwhere all big tech companies are listed. It is worth noting, however, that the \\nrisk oversight practices at big tech companies do have some similarities with \\nthe 3LoD model. For example, they all seem to have an internal audit function \\n(e.g. Microsoft, 2022; Alphabet, 2022). Based on public information, medium-\\nsized AI research labs do not seem to have implemented the model either. \\n2.3 Criticisms and alternative models \\nDespite the model’s popularity in many industries, it has also been criticized \\n(Arndorfer & Minto, 2015; Zhivitskaya, 2015; Davies & Zhivitskaya, 2018; \\nHoefer, Cooke, & Curry, 2020; Vousinas, 2021). Arndorfer and Minto (2015) \\nidentify four weaknesses and past failures of the 3LoD model. First, they argue, \\nthe incentives for risk-takers in the first line are often misaligned. When facing \\na tradeoff between generating profits and reducing risks, they have historically \\nbeen incentivized to prioritize the former. Second, there is often a lack of or-\\nganizational independence for second line functions. They are too close to \\nprofit-seekers, which can lead to the adoption of more risk-taking attitudes. \\nThird, second line functions often lack the necessary skills and expertise to \\nchallenge practices and controls in the first line. And fourth, the effectiveness \\nof internal audit depends on the knowledge, skills, and experience of individ-\\nuals, which might be inadequate. Another common criticism is that the model \\nprovides a false sense of security. Put simply, “when there are several people \\nin charge—no one really is” (Davies & Zhivitskaya, 2018). Another criticism \\nis that the model is too bureaucratic and costly. Additional layers of oversight \\nmight reduce risk, but they come at the cost of efficiency (Zhivitskaya, 2015). \\nA final criticism is that the model depends on information flow between the \\nlines, but there are many barriers to this. For example, the second line might \\nnot recognize that they only see what the first line chooses to show them \\n(Zhivitskaya, 2015). While these criticisms identify relevant shortcomings and \\nshould be taken seriously, they do not put into question the model as a whole. \\nMoreover, the 3LoD model has been improved over the years. Today, the focus \\nis on increasing the model’s effectiveness and responding to criticisms (Davies \\n& Zhivitskaya, 2018). \\n \\nIn view of these criticisms, several alternative models have been suggested. \\nFor example, Arndorfer and Minto (2015) proposed the Four Lines of Defense \\n(4LoD) model to better meet the needs of financial institutions. The fourth line \\nconsists of supervisory authorities and external audit, who are supposed to \\nwork closely with internal audit. Another example is the Five Lines of Assur-\\nance (5LoA) model, which was gradually developed by several scholars and \\nThree lines of defense against risks from AI \\n4 \\norganizations (Leech & Hanlon, 2016). However, the proposed changes do not \\nnecessarily improve the model. It has been argued that adding more lines would \\nover-complicate the model, and that firms and regulators currently do not want \\nstructural changes (Davies & Zhivitskaya, 2018). It is also worth noting that \\nthe alternative models are far less popular than the original model. 
Compared \\nto these alternative models, the 3LoD model remains “the most carefully artic-\\nulated risk management system that has so far been developed” (Davies & \\nZhivitskaya, 2018). But what empirical evidence do we have for its effective-\\nness? \\n2.4 Empirical evidence \\nBy “effectiveness”, I mean the degree to which the model helps organizations \\nto achieve their objectives. For the purpose of this article, I am mostly inter-\\nested in the achievement of risk objectives. This may include: (1) reducing \\nrelevant risks to an acceptable level (e.g. risks of harm to individuals, groups, \\nand society), (2) ensuring that management and the board of directors are aware \\nof the nature and scale of key risks, which allows them to define the organiza-\\ntion’s risk appetite (COSO, 2017), and (3) compliance with relevant risk regu-\\nlations (Schuett, 2022). I am less interested in other objectives (e.g. improving \\nfinancial performance), though there might be overlaps (e.g. reducing the risk \\nof harm to individuals might also reduce the risk of financial losses from liti-\\ngation cases). For an overview of different ways to measure the effectiveness \\nof internal audit, see Rupšys and Boguslauskas (2007), Savčuk (2007), and \\nBoţa-Avram and Palfi (2009). \\n \\nThere do not seem to be any (high-quality) studies on the effectiveness of \\nthe 3LoD model in the above-mentioned sense.2 There only seems to be evi-\\ndence for the effectiveness of internal audit (Lenz & Hahn, 2015; Eulerich & \\nEulerich, 2020). For example, a survey of CAEs at multinational companies in \\nGermany (n=37) compared audited and non-audited business units within the \\nsame company (Carcello et al., 2020). They found that managers of audited \\nunits perceive a greater decline in risk compared to managers of non-audited \\nunits. Other studies find that internal audit helps to strengthen internal control \\nsystems (Lin et al., 2011; Oussii & Taktak, 2018) and has a positive influence \\non the prevention and identification of fraud (Coram, Ferguson, & Moroney, \\n2008; Ma’ayan & Carmeli 2016; Drogalas et al., 2017). The fact that the 3LoD \\nmodel was not able to prevent past scandals and crises seems to provide weak \\n__________ \\n2 There is also not much evidence on the model’s effectiveness based on other interpre-\\ntations of effectiveness. The only exception seems to be a recent study of the 500 largest \\ncompanies in Denmark, which finds that a higher degree of adherence to first and second \\nline practices is positively associated with financial performance (Andersen, Sax, & Gian-\\nnozzi, 2022). Besides that, there are only studies on the effects of internal audit (Lenz & \\nHahn, 2015; Eulerich & Eulerich, 2020; Jiang, Messier, & Wood, 2020), none of which \\nmentions the 3LoD model. \\nJonas Schuett \\n5 \\nevidence against its effectiveness (though another explanation could be that the \\nmodel was poorly implemented in these cases), while the model’s ongoing \\npopularity seems to provide weak evidence in favor of its effectiveness (though \\nthe model’s popularity could also be explained by path dependencies). Finally, \\nthere is anecdotal evidence in both directions (Zhivitskaya, 2015). \\n \\nOverall, despite the model’s popularity, “its effectiveness [remains] un-\\ntested” (Davies & Zhivitskaya, 2018) and “not based on any clear evidence” \\n(Power, Ashby, & Palermo, 2013). To be clear, it is not the case that we have \\nrobust evidence that the model is ineffective. 
It is still very plausible that the \\nmodel can be effective, but there have not been (high-quality) studies providing \\nempirical evidence for its effectiveness in the above-mentioned sense. \\n \\nThis surprising lack of evidence could be explained by the following, rather \\nspeculative reasons. First, since it is not feasible to run randomized controlled \\ntrials on organizational interventions, it is inherently difficult to collect robust \\nevidence. Second, the model is designed to be flexible and adaptable, which \\nmeans that there is not a single, standardized way to implement it. This lack of \\nstandardization can make it difficult to compare different implementations of \\nthe model and to assess their effectiveness.3 Third, since most practitioners \\nmainly care about financial performance, scholars might be incentivized to fo-\\ncus on that to justify the relevance of their work (though there is not much \\nevidence on that either). \\n \\nEven if we had more empirical evidence from other industries, its informa-\\ntive value might still be limited. One reason is that findings might not general-\\nize to an AI context. AI companies are structurally different from other com-\\npanies because they have a special focus on research, and, since AI is a general-\\npurpose technology (Crafts, 2021), risks from AI are broader than risks from \\nother products and services. Another reason is that the biggest driver of the \\nmodel’s ability to reduce risks is likely the concrete way in which it is imple-\\nmented. Instead of asking “is the 3LoD model effective?”, AI companies \\nshould ask “how can we implement the model in an effective way?”. \\n3 Applying the 3LoD model to an AI context \\nThis section suggests ways in which AI companies could implement the 3LoD \\nmodel. For each of the three lines, I suggest equivalent roles and responsibili-\\nties. First, I describe the content of their responsibilities, then I discuss which \\nteam or individual would be responsible, as illustrated in Figure 2. \\n \\n__________ \\n3 This argument was suggested by OpenAI’s language model GPT-3. \\nThree lines of defense against risks from AI \\n6 \\n \\n \\nFigure 2: Sample org chart of an AI company with equivalent responsibilities \\nfor each of the three lines \\n3.1 First line \\nThe first line has two main responsibilities: providing products and services to \\nclients, which corresponds to AI research and product development, and man-\\naging the associated risks. Below, I focus on the latter. \\n \\nThe first line is responsible for establishing and maintaining appropriate \\nstructures and processes for the management of risk. This involves measures \\nalong all steps of the risk management process. For example, to identify risks \\nfrom AI, the first line could use risk taxonomies (Microsoft, 2020; Weidinger \\net al., 2021; Raji et al., 2022), incident databases (McGregor, 2021), or sce-\\nnario planning and wargaming (International Electrotechnical Commission \\n[IEC], 2019; Gyengo & Bruner, 2022). To estimate the likelihood and severity \\nof the identified risks, and to assess potential vulnerabilities, the first line might \\nuse Bayesian networks, Monte Carlo simulations, or penetration testing (IEC, \\n2019; International Organization for Standardization [ISO] & IEC, 2022). 
To \\nreduce risks, it could fine-tune the model on a curated dataset (Solaiman & \\nDennison, 2021), introduce a policy for the publication of potentially harmful \\nresearch (Partnership on AI, 2021; Solaiman et al., 2019), or only grant struc-\\ntured access to models (e.g. via an API) to reduce misuse risks (Shevelane, \\n2022). The first line could also take a more holistic approach and implement \\nan AI-specific risk management framework (e.g. NIST, 2022b; ISO & IEC, \\nn.d.) or customize a more general enterprise risk management (ERM) frame-\\nwork (e.g. ISO, 2018; Committee of Sponsoring Organizations of the Tread-\\nway Commission [COSO], 2017). \\n \\nThe first line is also responsible for ensuring compliance with legal, regula-\\ntory, and ethical expectations. Legal obligations might stem from anti-discrim-\\nination law (Wachter, Mittelstadt, & Russell, 2021), data protection law \\nJonas Schuett \\n7 \\n(Hamon et al., 2022), or antitrust law (Hua & Belfied, 2021). A notable exam-\\nple of AI regulation is the proposed EU AI Act (European Commission, 2021), \\nwhich requires providers of high-risk AI systems to implement a risk manage-\\nment system (Schuett, 2022). Ethical expectations might stem from AI ethics \\nprinciples that organizations have adopted on a voluntary basis (Jobin, Ienca, \\n& Vayena, 2019). To ensure compliance, the first line relies on support from \\nthe second line (see below). \\n \\nFinally, the first line is responsible for informing the governing body about \\nthe outcomes of the above-mentioned measures, the degree to which risk ob-\\njectives are met, and the overall level of risk. This should take the form of a \\ncontinuous dialogue, including reporting about expected and actual outcomes. \\nReporting will typically include heat maps and risk registers (IEC, 2019), but \\nit could also involve information about specific models, in the form of model \\ncards (Mitchell et al., 2018), data sheets (Gebru et al., 2018), and system cards \\n(Green et al., 2021). Note that there should also be a reporting line from the \\nCRO to the chief executive officer (CEO) and the risk committee of the board \\n(see below). \\n \\nResponsible are operational managers, often in a cascading responsibility \\nstructure. At big tech companies, the lowest level of responsibility would lie \\nwith those managers who are in charge of the development of individual AI \\nproducts. If there is no stand-alone AI product and AI systems make up only \\npart of a product (e.g. WaveNet as a part of Google Assistant), then the lowest \\nlevel of responsibility would lie with those managers who lead the develop-\\nment of the AI part of the product (e.g. the research lead for WaveNet). At \\nmedium-sized research labs, the lowest level of responsibility for risk manage-\\nment would lie with research leads, i.e. senior researchers who are in charge of \\nindividual research projects. \\n \\nThere will usually be one or more intermediate levels of responsibility. This \\nmight include a number of mid-level managers responsible for broader product \\nareas (e.g. gaming) or research areas (e.g. reinforcement learning), though the \\ndetails depend on the particular organizational structures. The ultimate respon-\\nsibility for AI risk management lies with those C-suite executives who are re-\\nsponsible for product development (e.g. the chief technology officer [CTO]) or \\nresearch (e.g. the chief scientific officer [CSO]). 
While it is possible to split \\nresponsibilities between two or more executives, this is often not advisable, \\nmainly because it can dilute responsibilities. \\n3.2 Second line \\nThe second line is responsible for assisting the first line with regards to risk \\nmanagement. It provides complementary expertise and support, but also mon-\\nitors and challenges risk management practices. \\n \\nSome risk management activities require special expertise that the first line \\ndoes not have. This might include legal expertise (e.g. how to comply with the \\nrisk management requirements set out in the proposed EU AI Act [Schuett, \\nThree lines of defense against risks from AI \\n8 \\n2022]), technical expertise (e.g. how to develop more truthful language models \\n[Evans et al., 2021]), or ethical expertise (e.g. how to define normative thresh-\\nolds for fairness [Kleinberg, Mullainathan, & Raghavan, 2016]). It might also \\ninclude risk-specific expertise (e.g. what risks language models pose \\n[Weidinger et al., 2021]) or risk management-specific expertise (e.g. best prac-\\ntices for red teaming safety filters [Rando et al., 2022]). The second line could \\nsupport the first line by drafting policies, processes, and procedures, as well as \\nframeworks, templates, and taxonomies. It might also advise on specific issues \\n(e.g. how to customize a risk management framework to better meet the spe-\\ncific needs of the company), provide general guidance (e.g. how to ensure com-\\npliance with safety-related policies among researchers and engineers), or offer \\ntrainings (e.g. how to process training data in a GDPR compliant way). \\n \\nThe second line is also responsible for monitoring and challenging the ade-\\nquacy and effectiveness of risk management practices. Risk management prac-\\ntices are ineffective if risk objectives are not met (e.g. the company fails to \\ncomply with relevant laws and regulations, or it is unable to reduce risks to an \\nacceptable level). They are inadequate if the same results could have been \\nachieved with fewer resources. The second line will typically use a number of \\nkey performance indicators (KPIs) to evaluate various dimensions of the ade-\\nquacy and effectiveness of risk management (e.g. number of identified risks, \\nnumber of incidents, or percentage of personnel trained on specific matters). \\n \\nSecond line responsibilities are split across multiple teams. This typically \\nincludes the risk management team as well as the legal and compliance team. \\nAlthough most big tech companies already have a risk management team, these \\nteams are mostly concerned with business risks (e.g. litigation or reputation \\nrisk). Risks from AI, especially societal risks, are usually not a major concern \\n(Smuha, 2021). If big tech companies want to change this, they could expand \\nthe responsibilities of existing teams. Setting up a new AI-specific risk man-\\nagement team seems less desirable, as it could lead to a diffusion of responsi-\\nbilities. There would likely be a cascading responsibility structure where the \\nCRO acts as the single point of accountability for the risk management process. \\nMedium-sized research labs usually do not have a dedicated risk management \\nteam. They could either set up a new team or task one or more people in other \\nteams with risk management-related support functions. \\n \\nAll AI companies beyond the early startup phase have a legal and compli-\\nance team. 
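As a companion to the reporting artifacts named above (heat maps, risk registers, model cards, data sheets, and system cards), the following sketch shows one way a first-line team could keep a risk register in code and bucket its entries into a crude qualitative heat-map summary for board-level reporting. The entries, rating scales, and thresholds are invented for the example and are not prescriptions from the article.

```python
# Illustrative risk register with a qualitative likelihood/severity heat-map summary.
# Entries and rating scales are invented for the example.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class RiskEntry:
    risk_id: str
    description: str
    likelihood: int   # 1 (rare) .. 5 (almost certain)
    severity: int     # 1 (negligible) .. 5 (critical)
    owner: str        # first-line manager responsible for the risk
    response: str     # e.g. mitigate, accept, transfer, avoid

REGISTER = [
    RiskEntry("R-001", "discriminatory outputs in hiring feature", 3, 4, "product lead A", "mitigate"),
    RiskEntry("R-002", "prompt injection bypasses safety filter", 4, 3, "research lead B", "mitigate"),
    RiskEntry("R-003", "training data includes personal data", 2, 5, "data lead C", "mitigate"),
]

def heat_map(register):
    """Group risk IDs into a (likelihood, severity) grid for reporting."""
    grid = defaultdict(list)
    for entry in register:
        grid[(entry.likelihood, entry.severity)].append(entry.risk_id)
    return grid

for (likelihood, severity), ids in sorted(heat_map(REGISTER).items(), reverse=True):
    score = likelihood * severity
    band = "high" if score >= 12 else "medium" if score >= 6 else "low"
    print(f"L{likelihood} x S{severity} ({band}): {', '.join(ids)}")
```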
The team lead, and ultimately the chief compliance officer (CCO) \\nor chief legal officer (CLO), would be responsible for risk-related legal and \\ncompliance support. It is worth noting that the legal and compliance team can \\nalso be part of the first line if they are actually responsible for ensuring com-\\npliance. They are part of the second line if they do not have any decision power \\nand only support the first line (e.g. by writing legal opinions). The legal and \\ncompliance team can also seek support from external law firms. \\n \\nMany organizations that develop and deploy AI systems have other teams \\nthat could take on second line responsibilities. This might include technical \\nsafety, ethics, policy, or governance teams. However, in practice, these teams \\nJonas Schuett \\n9 \\nrarely consider themselves as being responsible for risk management. This \\nneeds to be taken into account when implementing the 3LoD model (e.g. by \\nrunning workshops to sensitize them to their widened responsibility). In gen-\\neral, AI companies should arguably avoid assigning second line responsibili-\\nties to them. \\n3.3 Third line \\nThe third line is responsible for providing independent assurance. It assesses \\nthe work of the first two lines and reports any shortcomings to the governing \\nbody. \\n \\nWhile the second line already monitors and challenges the adequacy and \\neffectiveness of the risk management practices, the third line independently \\nassesses their work—they supervise the supervisors, so to speak. They could \\ndo this by conducting an internal audit (Raji et al, 2020) or commissioning an \\nexternal audit (Buolamwini & Gebru, 2018; Mökander & Floridi, 2022). Such \\naudits could have different purposes and scopes (Mökander et al., 2022). They \\ncould evaluate compliance with laws, standards, or ethics principles (“compli-\\nance audit”) or seek to identify new risks in a more open-ended fashion (“risk \\naudit”). They could also assess the model itself, including the dataset it was \\ntrained on (“model audit”), the model’s impact (“impact audit”), or the com-\\npany’s governance (“governance audit”). Similarly, the third line could engage \\na red team before or after a model is deployed to assess if the first two lines \\nwere able to identify all relevant risks (Ganguli et al., 2022; Perez et al., 2022). \\nFor example, before OpenAI released DALL·E 2, they asked a group of exter-\\nnal experts to identify ways in which the model can be misused (Mishkin et al., \\n2022). In addition to that, the third line should also review key policies and \\nprocesses to find flaws and vulnerabilities (e.g. ways in which a policy that \\nrequires researchers to assess the societal impact of a model can be circum-\\nvented [Ashurst et al., 2022]). Note that this should also include a meta-assess-\\nment of the company’s implementation of the 3LoD model itself. \\n \\nThe third line also supports the governing body, typically the board of di-\\nrectors, by providing independent and objective information about the com-\\npany’s risk management practices (IIA, 2020b). Their main audience is usually \\nthe audit committee, which is mainly composed of non-executive directors. But \\nsince non-executive directors only work part-time and heavily depend on the \\ninformation provided to them by the executives, they need an independent ally \\nin the company to effectively oversee the executives (Davies & Zhivitskaya, \\n2018). 
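The KPIs mentioned above for the second line (for example the number of identified risks, the number of incidents, or the percentage of personnel trained) can be tracked with very simple tooling. The sketch below is a hypothetical illustration of how such indicators might be computed from basic records; the records, field names, and the coverage threshold are assumptions, not figures from the article.

```python
# Hypothetical second-line KPI computation from simple record lists.
# The records and the 90% training-coverage threshold are invented for illustration.

identified_risks = ["R-001", "R-002", "R-003"]        # risks logged in the register this quarter
incidents = ["INC-17"]                                # incidents recorded this quarter
staff = {"alice": True, "bob": False, "carol": True}  # name -> completed risk training?

kpis = {
    "identified_risks": len(identified_risks),
    "incidents": len(incidents),
    "training_coverage_pct": 100 * sum(staff.values()) / len(staff),
}

for name, value in kpis.items():
    print(f"{name}: {value:.1f}" if isinstance(value, float) else f"{name}: {value}")

# A second-line team might flag coverage below an agreed threshold for follow-up.
if kpis["training_coverage_pct"] < 90:
    print("training coverage below target -- escalate to the risk management team")
```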
The third line serves this function by maintaining a high degree of in-\\ndependence from management and reporting directly to the governing body \\nfollowing best practices. It is often described as their “eyes and ears” (IIA, \\n2020a). \\n \\nThe third line has a well-defined organizational home: internal audit. Note \\nthat, in this context, internal audit refers to a specific organizational unit. It \\ndoes not merely mean an audit that is done internally (Raji et al, 2020). Instead, \\nThree lines of defense against risks from AI \\n10 \\nit means “those individuals operating independently from management to pro-\\nvide assurance and insight on the adequacy and effectiveness of governance \\nand the management of risk (including internal control)” (IIA, 2020a). \\n \\nTypically, companies have a dedicated internal audit team, led by the CAE \\nor Head of Internal Audit. Most big tech companies have such a team, but sim-\\nilar to the risk management team, they often neglect the societal risks from AI. \\nInstead of creating a separate AI-specific internal audit team, they should cre-\\nate a sub-team within their existing internal audit team, or simply task one or \\nmore team members to focus on AI-specific risk management activities. Me-\\ndium-sized research labs usually do not have an internal audit team. They \\nwould have to create a new team or task at least one person with third line \\nresponsibilities. In short, big tech companies need to “bring AI to internal au-\\ndit”, while research labs need to “bring internal audit to AI”. It is worth noting \\nthat, although there are promising developments (IIA, 2017a, 2017c), the pro-\\nfession of AI-specific internal auditors is still in its infancy. \\n \\nSome AI companies have an ethics board (e.g. Microsoft’s Aether Commit-\\ntee and Facebook’s Oversight Board) which could also take on third line re-\\nsponsibilities, typically in addition to internal audit. It would have to be organ-\\nizationally independent from management, but still be part of the organization \\n(in contrast to external assurance providers). If organizations already have an \\nindependent ethics board (e.g. consisting of representatives from academia and \\ncivil society), they could form a working group that takes on third line respon-\\nsibilities. \\n4 How the 3LoD model could help reduce risks from AI \\nWhile there are many reasons why AI companies may want to implement the \\n3LoD model, this section focuses on three arguments about the model’s ability \\nto prevent individual, collective, and societal harm: the model could help re-\\nduce risks from AI by identifying and closing gaps in risk coverage (Sec-\\ntion 4.1), increasing the effectiveness of risk management practices (Sec-\\ntion 4.2), and enabling the governing body to oversee management more effec-\\ntively (Section 4.3). I also give an overview of other benefits (Section 4.4). It \\nis worth noting that, in the absence of robust empirical evidence (see above), \\nthe following discussion remains theoretical and often relies on abstract plau-\\nsibility considerations. \\n4.1 Identifying and closing gaps in risk coverage \\nAI risk management involves different people from different teams with dif-\\nferent responsibilities (Baquero et al., 2020). If these responsibilities are not \\ncoordinated adequately, gaps in risk coverage can occur (Bantleon et al., 2021). \\nSuch gaps may have different causes. 
For example, it might be the case that no \\none is responsible for managing a specific risk (e.g. there could be a blind spot \\nJonas Schuett \\n11 \\nfor diffuse risks), or it might be unclear who is responsible (e.g. two teams \\nmight incorrectly assume that the other team already takes care of a risk). Gaps \\ncould also occur if the responsible person is not able to manage the risk effec-\\ntively (e.g. because they do not have the necessary expertise, information, or \\ntime). If a specific risk is not sufficiently covered by the risk management sys-\\ntem, it cannot be identified, which might result in an incorrect risk assessment \\n(e.g. the total risk of an unsafe AI system is judged acceptable) and an inade-\\nquate risk response (e.g. an unsafe AI system is deployed without sufficient \\nsafety precautions). \\n \\nThe 3LoD model could prevent this by identifying and closing gaps in risk \\ncoverage. It could do this by offering a systematic way to assign and coordinate \\nrisk management-related roles and responsibilities. It ensures that people who \\nare closest to the risk are responsible for risk management (first line) and get \\nthe support they need (second line). Another way the 3LoD model can help \\nidentify blindspots is through the internal audit function (third line). They are \\nresponsible for assessing the adequacy and effectiveness of the entire risk man-\\nagement regime, which includes potential gaps in risk coverage. \\n \\nOne might object that, in practice, gaps in risk coverage are rare, and even \\nif they occur, they only concern minor risks (e.g. because AI companies have \\nfound other ways to address the biggest risks). However, the AI Incident Da-\\ntabase (McGregor, 2021) contains numerous entries, including several cases \\nclassified as “moderate” or “severe”, which indicates that incidents are not that \\nuncommon. While these incidents had many different causes, it seems plausi-\\nble that at least some of them were related to gaps in risk coverage. But since \\nthere does not seem to be any public data on this, the issue remains speculative. \\n \\nEven if one thinks that gaps in risk coverage are a common problem among \\nAI companies, one might question the model’s ability to identify and close \\nthem. One might suspect that the people involved and their ability and willing-\\nness to identify gaps play a much bigger role. While it is certainly true that \\nimplementing the model alone is not sufficient, neither is having able and will-\\ning personnel. Both are necessary and only together can they be sufficient \\n(though other factors, such as information sharing between different organiza-\\ntional units, might also play a role). \\n \\nOverall, it seems likely that implementing the 3LoD model would help un-\\ncover some gaps in risk coverage that would otherwise remain unnoticed. \\n4.2 Increasing the effectiveness of risk management practices \\nSome risk management practices are ineffective—they might look good on pa-\\nper, but do not work in practice. AI companies might fail to identify relevant \\nrisks, misjudge their likelihood or severity, or be unable to reduce them to an \\nacceptable level. Ineffective risk management practices can have many differ-\\nent causes, such as reliance on a single measure (e.g. using a single taxonomy \\nto identify a wide range of risks), a failure to anticipate deliberate attempts to \\ncircumvent measures (e.g. 
stealing an unreleased model), a failure to anticipate \\nThree lines of defense against risks from AI \\n12 \\nrelevant changes in the risk landscape (e.g. the emergence of systemic risks \\ndue to the increasing reliance on so-called “foundation models” [Bommasani \\net al., 2021]), cognitive biases of risk managers (e.g. the availability bias, i.e. \\nthe tendency to “assess the frequency of a class or the probability of an event \\nby the ease with which instances or occurrences can be brought to mind” \\n[Tversky & Kahneman, 1974]), and other human errors (e.g. a person filling \\nout a risk register slips a line), among other things. \\n \\nThe 3LoD model can increase the effectiveness of risk management prac-\\ntices by identifying such shortcomings. As mentioned above, internal auditors \\nassess the effectiveness of risk management practices (e.g. via audits or red \\nteaming exercises) and report any shortcomings to the governing body, which \\ncan engage with management to improve these practices. \\n \\nOne might object that most shortcomings only occur in low-stakes situa-\\ntions. In high-stakes situations, existing risk management practices are already \\nmore effective. For example, AI companies often conduct extensive risk as-\\nsessments before deploying state-of-the-art models (Brundage et al., 2022; Ka-\\nvukcuoglu et al., 2022). While this might be true in obvious cases, there are \\nless obvious cases where practices might not be as effective as intended (e.g. \\nbecause they are insensitive to human errors or deliberate attempts to circum-\\nvent them). I would certainly not want to rely on the counterargument that the \\neffectiveness of risk management practices already scales sufficiently with the \\nstakes at hand. \\n \\nSome AI companies might further object that they already have the equiva-\\nlent of an internal audit function, so implementing the 3LoD would only be a \\nmarginal improvement. While it might be true that some people at some com-\\npanies perform some tasks that are similar to what internal auditors do, to the \\nbest of my knowledge, assessing the effectiveness of risk management prac-\\ntices is not their main responsibility and they do not follow best practices from \\nthe internal audit profession, such as being organizationally independent from \\nmanagement (IIA, 2017b), which can lead to noticeable differences. \\n \\nOverall, I think this is one of the best arguments for implementing the 3LoD \\nmodel. Without a serious attempt to identify ineffective risk management prac-\\ntices, I expect at least some shortcomings to remain unnoticed. The degree to \\nwhich this is true mainly depends on internal audit’s ability and willingness to \\nserve this function. \\n4.3 Enabling the governing body to oversee management more effectively \\nThe governing body, typically the board of directors, is responsible for over-\\nseeing management. To do this, they need independent and objective infor-\\nmation about the company’s risk management practices. However, they heav-\\nily rely on information provided to them by the executives. To effectively over-\\nsee the executives, they need an independent ally in the company. \\n \\nInternal audit serves this function by maintaining a high degree of independ-\\nence from management and reporting directly to the audit committee of the \\nJonas Schuett \\n13 \\nboard. This can be important because, compared to other actors, the board has \\nsignificant influence over management. 
For example, they can replace the CEO \\n(e.g. if they repeatedly prioritize profits over safety), make strategic decisions \\n(e.g. blocking a strategic partnership with the military), and make changes to \\nthe company’s risk governance (e.g. setting up an ethics board). Note that there \\nis a complementary reporting line from the CRO to the risk committee of the \\nboard. \\n \\nOne might object that this function could also be served by other actors. For \\nexample, third-party auditors could also provide the board with independent \\nand objective information. While external audits can certainly play an im-\\nportant role, they have several disadvantages compared to internal audits: they \\nmight lack important context, companies might not want to share sensitive in-\\nformation with them (e.g. about ongoing research projects), and audits are typ-\\nically only snapshots in time. AI companies should therefore see external audit \\nas a complement to internal audit, not a substitution. There is a reason why the \\n3LoD model distinguishes between internal audit and external assurance pro-\\nviders. \\n \\nOne might further point out that in other industries, internal audit is often \\nperceived to intervene too late (Davies & Zhivitskaya, 2018) and to team up \\nwith management, instead of monitoring them (Roussy & Rodrigue, 2018). \\nThis would indeed be problematic. However, as discussed above, this does not \\nseem to be an inherent property of internal audit. Instead, it seems to be mainly \\ndriven by the particular way it is set up and the people involved. Having said \\nthat, AI companies should take this concern seriously and take measures to \\naddress it. \\n \\nOverall, I think that implementing the 3LoD model can significantly in-\\ncrease the board’s information base. This effect will be more noticeable at me-\\ndium-sized research labs, as most big tech companies already have an internal \\naudit function, albeit not an AI-specific one (see above). \\n4.4 Other benefits \\nImplementing the 3LoD model has many benefits other than reducing risks to \\nindividuals, groups, or society. Although these other benefits are beyond the \\nscope of this article, it seems warranted to at least give an overview. Below, I \\nbriefly discuss four of them. \\n \\nFirst, implementing the 3LoD model can avoid unnecessary duplications of \\nrisk coverage. Different people in different teams could be doing the same or \\nvery similar risk management work. This is often desirable because it can pre-\\nvent gaps in risk coverage (see above). But if such duplications are not neces-\\nsary, they can waste resources, such as labor, that could be used more produc-\\ntively elsewhere. AI companies therefore face an effectiveness-efficiency-\\ntradeoff. How this tradeoff ought to be resolved, depends on the particular con-\\ntext. For example, when dealing with catastrophic risks, effectiveness (prevent-\\ning gaps in risk coverage) seems more important than efficiency (avoiding \\nThree lines of defense against risks from AI \\n14 \\nunnecessary duplications of coverage). In this case, AI companies should \\nstrictly err on the side of too much coverage rather than risk gaps in important \\nareas. Overall, this benefit seems to be overstated and less relevant if one is \\nmainly concerned with risk reduction. \\n \\nSecond, AI companies that have implemented the 3LoD model might be \\nperceived as being more responsible. 
In general, risk management practices at \\nAI companies seem less advanced compared to many other industries (e.g. avi-\\nation or banking). By adapting existing best practices from other industries, \\nthey would signal that they aim to further professionalize their risk manage-\\nment practices, which could be perceived as being more responsible. This per-\\nception might have a number of benefits. For example, it could make it easier \\nto attract and retain talent that cares about ethics and safety. It could also help \\navoid overly burdensome measures from regulators. It might even be beneficial \\nin litigation cases for the question of whether or not an organization has ful-\\nfilled its duty of care. However, it seems questionable whether implementing \\nthe 3LoD model affects perception that much, especially compared to other \\ngovernance measures (e.g. publishing AI ethics principles or setting up an AI \\nethics board), mainly because most stakeholders, including most employees, \\ndo not know the model and cannot assess its relevance. An exception might be \\nregulators and courts who care more about the details of risk management prac-\\ntices. My best guess is that implementing the model will have noticeable effects \\non the perception of a few stakeholders, while most other stakeholders will not \\ncare. \\n \\nThird, implementing the 3LoD model can make it easier to hire risk man-\\nagement talent. The profession of AI risk management is in its infancy. I as-\\nsume that AI companies find it challenging to hire people with AI and risk \\nmanagement expertise. In most cases, they can either hire AI experts and train \\nthem in risk management, or hire risk management experts from other indus-\\ntries and train them in AI. Implementing the 3LoD model could make it easier \\nto hire risk management experts from other industries because they would al-\\nready be familiar with the model. This might become more important if one \\nassumes that AI companies will want to hire more risk management talent as \\nsystems get more capable and are used in more safety-critical situations (e.g. \\nDegrave et al., 2022). However, I do not find this argument very convincing. I \\ndoubt that implementing the 3LoD model would make a meaningful difference \\non relevant hiring decisions (e.g. on a candidate’s decision to apply or accept \\nan offer). Since the model is about the organizational dimension of risk man-\\nagement, it does not have significant effects on the day-to-day risk manage-\\nment work. Having said that, there might be smaller benefits (e.g. making the \\nonboarding process easier). My best guess is that the counterfactual impact of \\n3LoD implementation on hiring is low. \\n \\nFourth, implementing the 3LoD model might reduce financing costs. Rating \\nagencies tend to give better ratings to companies that have implemented an \\nERM framework (because doing so is considered best practice), and companies \\nwith better ratings tend to have lower financing costs (because they get better \\nJonas Schuett \\n15 \\ncredit conditions) (see Bohnert et al., 2019). There might be an analogous ef-\\nfect with regards to the implementation of the 3LoD model. Lower financing \\ncosts are particularly important if one assumes that the costs for developing \\nstate-of-the-art AI systems will increase because of increasing demand for \\ncompute (Sevilla et al., 2022), for example. 
In scenarios where commercial \\npressure is much higher than today, lower financing costs could also be im-\\nportant to continue safety research that does not contribute to product develop-\\nment. That said, I am uncertain to what extent the findings for ERM frame-\\nworks generalize to the 3LoD model. My best guess is that implementing the \\n3LoD would not have meaningful effects on the financing costs of medium-\\nsized research labs today. But I expect this to change as labs become more \\nprofitable and increasingly make use of other funding sources (e.g. credits or \\nbonds). \\n5 Conclusion \\nThis article has applied the 3LoD model to an AI context. It has suggested \\nconcrete ways in which medium-sized research labs like DeepMind and \\nOpenAI or big tech companies like Google and Microsoft could implement the \\nmodel to reduce risks from AI. It has argued that implementing the model could \\nprevent individual, collective, or societal harm by identifying and closing gaps \\nin risk coverage, increasing the effectiveness of risk management practices, \\nand enabling the governing body to oversee management more effectively. It \\nconcluded that, while there are some limitations and the effects should not be \\noverstated, the model can plausibly contribute to a reduction of risks from AI. \\n \\nBased on the findings of this article, I suggest the following questions for \\nfurther research. First, the article has highlighted the importance of internal \\naudit in AI risk management. But since there has not been much AI-specific \\nwork on internal audit, it would be valuable to review best practices from other \\nindustries (e.g. BCBS, 2012) and discuss the extent to which these practices \\nare applicable to an AI context. Second, my discussion of the model’s ability \\nto reduce risks from AI was mostly theoretical and relied on abstract plausibil-\\nity considerations. I encourage other scholars to assess these claims empiri-\\ncally. An industry case study similar to the one that Mökander and Floridi \\n(2022) conducted for ethics-based auditing could be a first step. Third, alt-\\nhough AI companies have not implemented the 3LoD model, they already per-\\nform many of the above-mentioned activities. To better target future work, it \\nwould be helpful to review existing risk management practices at leading AI \\ncompanies and conduct a gap analysis. Since public data is scarce, scholars \\nwould have to conduct interviews or surveys (e.g. an “AI risk management \\nbenchmark survey”), though I expect confidentiality to be a major obstacle. \\nFourth, the article has focused on the voluntary adoption of the 3LoD model. \\nIt would be important to know if existing or future regulations might even \\nThree lines of defense against risks from AI \\n16 \\nrequire AI companies to implement the model. For example, while Article 9 of \\nthe proposed EU AI Act does not mention the 3LoD model, it has been sug-\\ngested that future harmonized standards or common specifications should in-\\nclude the model (Schuett, 2022). The 3LoD model is also mentioned in the \\nplaybook that accompanies the NIST AI Risk Management Framework (NIST, \\n2022a, 2022b). It is conceivable that this framework will be translated into US \\nlaw, similar to the NIST Framework for Improving Critical Infrastructure Cy-\\nbersecurity (NIST, 2018). Finally, the article has investigated the 3LoD in iso-\\nlation. 
It has excluded contextual factors, such as the risk culture at AI compa-\\nnies, which might also affect the model’s effectiveness. A better understanding \\nof these factors would further improve the information base for decision-mak-\\ners at AI companies and beyond. \\n \\nAs famously put by George Box (1976), “all models are wrong, but some \\nare useful”. In the same spirit, one might say that the 3LoD model is not a silver \\nbullet against the risks from AI, but it can still play an important role. AI com-\\npanies should see it as one of many governance tools they can use to tackle \\ntoday’s and tomorrow’s threats from AI. \\nAcknowledgements \\nI am grateful for valuable comments and feedback from Leonie Koessler, \\nJames Ginns, Markus Anderljung, Andre Barbe, Noemie Dreksler, Toby \\nShevelane, Anne le Roux, Alexis Carlier, Emma Bluemke, Christoph Winter, \\nRenan Araújo, José Jaime Villalobos, Suzanne Van Arsdale, Alfredo Parra, \\nand Nick Hollman. All remaining errors are my own. \\nReferences \\nAlphabet. (2022). Notice of 2022 annual meeting of stockholders and proxy statement. SEC. \\nhttps://perma.cc/Q23E-WQWP \\nAndersen, T. J., Sax, J., & Giannozzi, A. (2022). Conjoint effects of interacting strategy-\\nmaking processes and lines of defense practices in strategic risk management: An empi-\\nrical \\nstudy. \\nLong \\nRange \\nPlanning, \\n55(6), \\n102164. \\nhttps://doi.org/10.1016/j.lrp.2021.102164 \\nArndorfer, I., & Minto, A. (2015). The “four lines of defence model” for financial instituti-\\nons. \\nFinancial \\nStability \\nInstitute, \\nBank \\nfor \\nInternational \\nSettlements. \\nhttps://perma.cc/UP35-KEYJ \\nAshurst, C., Hine, E., Sedille, P., & Carlier, A. (2022). AI ethics statements: Analysis and \\nlessons learnt from NeurIPS broader impact statements. 2022 ACM Conference on Fair-\\nness, \\nAccountability, \\nand \\nTransparency, \\n2047–2056. \\nhttps://doi.org/10.1145/3531146.3533780 \\nJonas Schuett \\n17 \\nBantleon, U., d’Arcy, A., Eulerich, M., Hucke, A., Pedell, B., & Ratzinger-Sakel, N. V. S. \\n(2021). Coordination challenges in implementing the three lines of defense model. Inter-\\nnational Journal of Auditing, 25(1), 59–74. https://doi.org/10.1111/ijau.12201 \\nBaquero, J. A., Burkhardt, R., Govindarajan, A., & Wallace, T. (2020). Derisking AI by \\ndesign: How to build risk management into AI development. McKinsey. \\nhttps://perma.cc/2WPN-A6CW \\nBCBS. (1999). Enhancing corporate governance for banking organisations. Bank for Inter-\\nnational Settlements. https://perma.cc/G2QP-7K5B \\nBCBS. (2012). The internal audit function in banks. Bank for International Settlements. \\nhttps://perma.cc/A57Q-8LZ6 \\nBoatright, J. (2016). Why risk management failed: Ethical and behavioral aspects. In A. G. \\nMalliaris, L. Shaw, & H. Shefrin (Eds.), The global financial crisis and its aftermath: \\nHidden \\nfactors \\nin \\nthe \\nmeltdown \\n(pp. 384–386). \\nOxford \\nUniversity \\nPress. \\nhttps://doi.org/10.1093/acprof:oso/9780199386222.003.0017 \\nBohnert, A., Gatzert, N., Hoyt, R. E., & Lechner, P. (2019). The drivers and value of enter-\\nprise risk management: Evidence from ERM ratings. The European Journal of Finance, \\n25(3), 234–255. https://doi.org/10.1080/1351847X.2018.1514314 \\nBommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. \\nS., Bohg, J., Bosselut, A., Brunskill, E., Brynjolfsson, E., Buch, S., Card, D., Castellon, \\nR., Chatterji, N., Chen, A., Creel, K., Davis, J. 
Q., Demszky, D., … Liang, P. (2021). On \\nthe opportunities and risks of foundation models. arXiv. https://doi.org/10.48550/ar-\\nXiv.2108.07258 \\nBoţa-Avram, C., & Palfi, C. (2009). Measuring and assessment of internal audit’s effec-\\ntiveness. Annals of Faculty of Economics, University of Oradea, 3(1), 784–790. \\nhttps://perma.cc/N5JC-J79R \\nBox, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, \\n71(356), 791–799. https://doi.org/10.1080/01621459.1976.10480949 \\nBrundage, M., Mayer, K., Eloundou, T., Agarwal, S., Adler, S., Krueger, G., Leike, J., & \\nMishkin, P. (2022, March 3). Lessons learned on language model safety and misuse. \\nOpenAI. https://perma.cc/8RKR-QJZY \\nBuolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in \\ncommercial gender classification. Proceedings of the 1st Conference on Fairness, Ac-\\ncountability and Transparency, 77–91. https://perma.cc/976J-AR93 \\nCarcello, J. V., Eulerich, M., Masli, A., & Wood, D. A. (2020). Are internal audits associated \\nwith reductions in perceived risk? Auditing: A Journal of Practice & Theory, 39(3), 55–\\n73. https://doi.org/10.2308/ajpt-19-036 \\nCheatham, B., Javanmardian, K., & Samandari, H. (2019). Confronting the risks of artificial \\nintelligence. McKinsey. https://perma.cc/T2CX-HYZF \\nClarke, S., & Whittlestone, J. (2022). A survey of the potential long-term impacts of AI: \\nHow AI could lead to long-term changes in science, cooperation, power, epistemics and \\nvalues. Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, 192–\\n202. https://doi.org/10.1145/3514094.3534131 \\nCoram, P., Ferguson, C., & Moroney, R. (2008). Internal audit, alternative internal audit \\nstructures and the level of misappropriation of assets fraud. Accounting and Finance, \\n48(4), 543–559. https://doi.org/10.1111/j.1467-629X.2007.00247.x \\nCOSO. (2017). Enterprise risk management — Integrating with strategy and performance. \\nhttps://perma.cc/5Z3G-KD6R \\nThree lines of defense against risks from AI \\n18 \\nCrafts, N. (2021). Artificial intelligence as a general-purpose technology: An historical per-\\nspective. Oxford Review of Economic Policy, 37(3), 521–536. https://doi.org/10.1093/ox-\\nrep/grab012 \\nD'Onza, G., Selim, G. M., Melville, R., & Allegrini, M. (2015). A study on internal auditor \\nperceptions of the function ability to add value. International Journal of Auditing, 19(3), \\n182–194. https://doi.org/10.1111/ijau.12048 \\nDavies, H., & Zhivitskaya, M. (2018). Three lines of defence: A robust organising frame-\\nwork, \\nor \\njust \\nlines \\nin \\nthe \\nsand? \\nGlobal \\nPolicy, \\n9(S1), \\n34–42. \\nhttps://doi.org/10.1111/1758-5899.12568 \\nDegrave, J., Felici, F., Buchli, J., Neunert, M., Tracey, B., Carpanese, F., Ewalds, T., Hafner, \\nR., Abdolmaleki, A., de las Casas, D., Donner, C., Fritz, L., Galperti, C., Huber, A., \\nKeeling, J., Tsimpoukelli, M., Kay, J., Merle, A., Moret, J.-M., … Riedmiller, M. (2022). \\nMagnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602, \\n414–419. https://doi.org/10.1038/s41586-021-04301-9 \\nDrogalas, G., Pazarskis, M., Anagnostopoulou, E., & Papachristou, A. (2017). The effect of \\ninternal audit effectiveness, auditor responsibility and training in fraud detection. Journal \\nof \\nAccounting \\nand \\nManagement \\nInformation \\nSystems \\n16(4), \\n434–454. \\nhttps://doi.org/10.24818/jamis.2017.04001 \\nEulerich, A., & Eulerich, M. 
(2020). What is the value of internal auditing? A literature \\nreview on qualitative and quantitative perspectives. Maandblad Voor Accountancy En \\nBedrijfseconomie, 94(3/4), 83–92. https://doi.org/10.5117/mab.94.50375 \\nEuropean Banking Authority. (2021). Final report on guidelines on internal governance \\nunder Directive 2013/36/EU (EBA/GL/2021/05). https://perma.cc/RCD8-V99V \\nEuropean Commission. (2021). Proposal for a regulation laying down harmonized rules on \\nartificial \\nintelligence \\n(Artificial \\nIntelligence \\nAct) \\n(COM(2021) \\n206 \\nfinal). \\nhttps://perma.cc/4YXM-38U9 \\nEvans, O., Cotton-Barratt, O., Finnveden, L., Bales, A., Balwit, A., Wills, P., Righetti, L., \\n& Saunders, W. (2021). Truthful AI: Developing and governing AI that does not lie. ar-\\nXiv. https://doi.org/10.48550/arXiv.2110.06674 \\nFinancial Services Authority. (2003). Building a framework for operational risk manage-\\nment: The FSA’s observations. https://perma.cc/5AX2-M2LF \\nGanguli, D., Lovitt, L., Kernion, J., Askell, A., Bai, Y., Kadavath, S., Mann, B., Perez, E., \\nSchiefer, N., Ndousse, K., Jones, A., Bowman, S., Chen, A., Conerly, T., DasSarma, N., \\nDrain, D., Elhage, N., El-Showk, S., Fort, S., … Clark, J. (2022). Red teaming language \\nmodels to reduce harms: Methods, scaling behaviors, and lessons learned. arXiv. \\nhttps://doi.org/10.48550/arXiv.2209.07858 \\nGebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., & \\nCrawford, K. (2021). Datasheets for datasets. arXiv. https://doi.org/10.48550/ar-\\nXiv.1803.09010 \\nGreen, N., Procope, C., Cheema, A., & Adediji, A. (2022, February 23). System cards, a new \\nresource for understanding how AI systems work. Meta AI. https://perma.cc/CQZ8-FQ44 \\nGyengo, J., & Bruner, M. (2022). Scenario planning and wargaming: What they are and \\nwhy we need them in today’s uncertain future. KPMG. https://perma.cc/BT67-VCM9 \\nHamon, R., Junklewitz, H., Sanchez, I., Malgieri, G., & De Hert, P. (2022). Bridging the gap \\nbetween AI and explainability in the GDPR: Towards trustworthiness-by-design in auto-\\nmated decision-making. IEEE Computational Intelligence Magazine, 17(1), 72–85. \\nhttps://doi.org/10.1109/MCI.2021.3129960 \\nHoefer, E., Cooke, M., & Curry, T. (2020, September 8). Three lines of defense: Failed \\npromises and what comes next. Reuters. https://perma.cc/V35X-VTC5 \\nJonas Schuett \\n19 \\nHua, S.-S., & Belfield, H. (2021). AI & antitrust: Reconciling tensions between competition \\nlaw and cooperative AI development. Yale Journal of Law & Technology, 23(415). \\nhttps://perma.cc/4VL2-QNNJ \\nHuibers, S. C. J. (2015). Combined assurance: One language, one voice, one view. IIA Re-\\nsearch \\nFoundation, \\nGlobal \\nInternal \\nAudit \\nCommon \\nBody \\nof \\nKnowledge. \\nhttps://perma.cc/D7YM-9GSY \\nIEC. (2019). Risk management — Risk assessment techniques (IEC Standard \\nNo. 31010:2019) https://perma.cc/5CDZ-GU9R \\nIIA. (2013). IIA position paper: The three lines of defense in effective risk management and \\ncontrol. https://perma.cc/NQM2-DD7V \\nIIA. (2017a). Artificial intelligence: Considerations for the profession of internal auditing \\n(Part I). https://perma.cc/K8WQ-VNFZ \\nIIA. (2017b). International standards for the professional practice of internal auditing. \\nhttps://perma.cc/AKU7-8YWZ \\nIIA. (2017c). The IIA’s artificial intelligence auditing framework: Practical applications \\n(Part A). https://perma.cc/U93U-LN75 \\nIIA. (2018). 
The IIA’s artificial intelligence auditing framework. Practical applications \\n(Part B). https://perma.cc/826X-Y3L7 \\nIIA. (2020a). The IIA’s three lines model: An update of the three lines of defense. \\nhttps://perma.cc/GAB5-DMN3 \\nIIA. (2020b). Good practice internal audit reports. https://perma.cc/7BQT-DTRD \\nISO. (2018). Risk management — Guidelines (ISO Standard No. 31000:2018). \\nhttps://perma.cc/76EJ-GATT \\nISO & IEC. (2022). Information security, cybersecurity and privacy protection — Informa-\\ntion security management systems — Requirements (ISO/IEC Standard No. 27001:2022). \\nhttps://perma.cc/Y237-ZMG2 \\nISO & IEC. (n.d.). Information technology — Artificial intelligence — Guidance on risk \\nmanagement (ISO/IEC Standard No. 23894). https://perma.cc/GL3Q-WHQ2 \\nJiang, L., Messier, W. F., & Wood, D. A. (2020). The association between internal audit \\noperations-related services and firm operating performance. Auditing: A Journal of Prac-\\ntice & Theory, 39(1), 101–124. https://doi.org/10.2308/ajpt-52565 \\nJobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. \\nNature Machine Intelligence, 1, 389–399. https://doi.org/10.1038/s42256-019-0088-2 \\nKavukcuoglu, K., Kohli, P., Ibrahim, L., Bloxwich, D., & Brown, S. (2022, September 14). \\nHow \\nour \\nprinciples \\nhelped \\ndefine \\nAlphaFold’s \\nrelease. \\nDeepMind. \\nhttps://perma.cc/3ARS-XLNV \\nKleinberg, J., Mullainathan, S., & Raghavan, M. (2016). Inherent trade-offs in the fair de-\\ntermination of risk scores. arXiv. https://doi.org/10.48550/arXiv.1609.05807 \\nKräussl, R. (2003). A critique on the proposed use of external sovereign credit ratings in \\nBasel II. \\nCenter \\nfor \\nFinancial \\nStudies, \\nGoethe \\nUniversity \\nFrankfurt. \\nhttps://perma.cc/PMB8-WSCA \\nLeech, T. J., & Hanlon, L. C. (2016). Three lines of defense versus five lines of assurance: \\nElevating the role of the board and CEO in risk governance. In R. Leblanc (Ed.), The \\nhandbook of board governance: A comprehensive guide for public, private and not-for-\\nprofit \\nboard \\nmembers \\n(pp. 335–355). \\nWiley. \\nhttps://doi.org/10.1002/9781119245445.ch17 \\nLenz, R., & Hahn, U. (2015) A synthesis of empirical internal audit effectiveness literature \\npointing to new research opportunities. Managerial Auditing Journal, 30(1), 5–33. \\nhttps://doi.org/10.1108/MAJ-08-2014-1072 \\nThree lines of defense against risks from AI \\n20 \\nLin, S., Pizzini, M., Vargus, M., & Bardhan, I. R. (2011). The role of the internal audit \\nfunction in the disclosure of material weaknesses. The Accounting Review, 86(1), 287–\\n323. https://doi.org/10.2308/accr.00000016 \\nMa’ayan, Y., & Carmeli, A. (2016). Internal audits as a source of ethical behavior, effi-\\nciency, and effectiveness in work units. Journal of Business Ethics, 137(2), 347–363. \\nhttps://doi.org/10.1007/s10551-015-2561-0 \\nMcGregor, S. (2021). Preventing repeated real world AI failures by cataloging incidents: \\nThe AI incident database. Proceedings of the AAAI Conference on Artificial Intelligence, \\n35(17), 15458–15463. https://doi.org/10.1609/aaai.v35i17.17817 \\nMicrosoft. (2020). Assessing harm: A guide for tech builders. https://perma.cc/PV3E-HL23 \\nMicrosoft. (2022). Notice of annual shareholders meeting and proxy statement 2022. SEC. \\nhttps://perma.cc/6NYQ-ZTMB \\nMishkin, P., Ahmad, L., Brundage, M., Krueger, G., & Sastry, G. (2022). DALL·E 2 preview: \\nRisks and limitations. GitHub. 
https://perma.cc/X467-47PX \\nMitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., Spitzer, E., \\nRaji, I. D., & Gebru, T. (2019). Model cards for model reporting. Proceedings of the \\nConference \\non \\nFairness, \\nAccountability, \\nand \\nTransparency, \\n220–229. \\nhttps://doi.org/10.1145/3287560.3287596 \\nMökander, J., & Floridi, L. (2022). Operationalising AI governance through ethics-based \\nauditing: An industry case study. AI and Ethics. https://doi.org/10.1007/s43681-022-\\n00171-7 \\nMökander, J., Schuett, J., Kirk, H. R., Floridi, L. (2022). Auditing large language models \\n[Unpublished manuscript]. \\nNasdaq. \\n(2022). \\nNasdaq \\n5600 \\nseries: \\nCorporate \\ngovernance \\nrequirements. \\nhttps://perma.cc/4M7B-U42F \\nNIST. (2018). Framework for improving critical infrastructure cybersecurity (Version 1.1). \\nhttps://doi.org/10.6028/NIST.CSWP.04162018 \\nNIST. (2022a). NIST AI risk management framework: Playbook (Govern 4.1). \\nhttps://perma.cc/T7MQ-MVJ7 \\nNIST. (2022b). NIST AI risk management framework: Second draft. https://perma.cc/6EJ9-\\nUZ9A \\nNunn, R. (2020). Discrimination in the age of algorithms. In W. Barfield (Ed.), The \\nCambridge handbook of the law of algorithms (pp. 182–198). Cambridge University \\nPress. https://doi.org/10.1017/9781108680844.010 \\nOrd, T. (2021). Proposal for a new ‘three lines of defence’ approach to UK risk management. \\nFuture of Humanity Institute, University of Oxford. https://perma.cc/VHH9-L36R \\nOussii, A. A., & Boulila Taktak, N. (2018). The impact of internal audit function charac-\\nteristics on internal control quality. Managerial Auditing Journal, 33(5), 450–469. \\nhttps://doi.org/10.1108/MAJ-06-2017-1579 \\nPartnership on AI. (2021). Managing the risks of AI research: Six recommendations for \\nresponsible publication. https://perma.cc/BX5A-KE8D \\nPerez, E., Huang, S., Song, F., Cai, T., Ring, R., Aslanides, J., Glaese, A., McAleese, N., & \\nIrving, G. (2022). Red teaming language models with language models. arXiv. \\nhttps://doi.org/10.48550/arXiv.2202.03286 \\nPower, M., Ashby, S., & Palermo, T. (2013). Risk culture in financial organisations: A re-\\nsearch \\nreport. \\nThe \\nLondon \\nSchool \\nof \\nEconomics \\nand \\nPolitical \\nScience. \\nhttps://perma.cc/R9YC-AT4Z \\nJonas Schuett \\n21 \\nRaji, I. D., Kumar, I. E., Horowitz, A., & Selbst, A. (2022). The fallacy of AI functionality. \\n2022 ACM Conference on Fairness, Accountability, and Transparency, 959–972. \\nhttps://doi.org/10.1145/3531146.3533158 \\nRaji, I. D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, \\nJ., Theron, D., & Barnes, P. (2020). Closing the AI accountability gap: Defining an end-\\nto-end framework for internal algorithmic auditing. arXiv. https://doi.org/10.48550/ar-\\nXiv.2001.00973 \\nRando, J., Paleka, D., Lindner, D., Heim, L., & Tramèr, F. (2022). Red-teaming the stable \\ndiffusion safety filter. arXiv. https://doi.org/10.48550/arXiv.2210.04610 \\nRao, A., & Golbin, I. (2021, January 11). Top-down and end-to-end governance for the \\nresponsible use of AI. Medium. https://perma.cc/SM8Y-6CUN \\nRoussy, M., & Rodrigue, M. (2018). Internal audit: Is the ‘third line of defense’ effective as \\na form of governance? An exploratory study of the impression management techniques \\nchief audit executives use in their annual accountability to the audit committee. Journal \\nof Business Ethics, 151, 853–869. 
https://doi.org/10.1007/s10551-016-3263-y \\nRupšys, R., & Boguslauskas, V. (2007). Measuring performance of internal auditing: Empi-\\nrical evidence. Engineering Economics, 55(5): 9–15. https://perma.cc/RA8D-35B3 \\nSavčuk, O. (2007). Internal audit efficiency evaluation principles. Journal of Business Eco-\\nnomics & Management, 8(4), 275–284. https://doi.org/10.3846/16111699.2007.9636180 \\nSchuett, J. (2022). Risk management in the Artificial Intelligence Act. arXiv. \\nhttps://doi.org/10.48550/arXiv.2212.03109 \\nSekar, M. (2022). Machine learning for auditors: Automating fraud investigations through \\nartificial intelligence. Apress. https://doi.org/10.1007/978-1-4842-8051-5 \\nSevilla, J., Heim, L., Ho, A., Besiroglu, T., Hobbhahn, M., & Villalobos, P. (2022). Compute \\ntrends across three eras of machine learning. arXiv. https://doi.org/10.48550/ar-\\nXiv.2202.05924 \\nShevlane, T. (2022). Structured access: An emerging paradigm for safe AI deployment. In \\nJ. B. Bullock, Y.-C. Chen, J. Himmelreich, V. M. Hudson, A. Korinek, M. M. Young, & \\nB. Zhang (Eds.), The Oxford handbook of AI governance. Oxford University Press. \\nhttps://doi.org/10.1093/oxfordhb/9780197579329.013.39 \\nSmuha, N. A. (2021). Beyond the individual: Governing AI’s societal harm. Internet Policy \\nReview, 10(3). https://doi.org/10.14763/2021.3.1574 \\nSolaiman, I., & Dennison, C. (2021). Process for adapting language models to society \\n(PALMS) with values-targeted datasets. Advances in Neural Information Processing Sys-\\ntems 34, 5861–5873. https://perma.cc/TF3N-SBPS \\nSolaiman, I., Brundage, M., Clark, J., Askell, A., Herbert-Voss, A., Wu, J., Radford, A., \\nKrueger, G., Kim, J. W., Kreps, S., McCain, M., Newhouse, A., Blazakis, J., McGuffie, \\nK., & Wang, J. (2019). Release strategies and the social impacts of language models. \\narXiv. https://doi.org/10.48550/arXiv.1908.09203 \\nTammenga, A. (2020). The application of artificial intelligence in banks in the context of \\nthe three lines of defence model. Maandblad Voor Accountancy En Bedrijfseconomie, \\n94(5/6), 219–230. https://doi.org/10.5117/mab.94.47158 \\nTversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. \\nScience, 185(4157), 1124–1131. https://doi.org/10.1126/science.185.4157.1124 \\nVousinas, G. L. (2021). Beyond the three lines of defense: The five lines of defense model \\nfor financial institutions. ACRN Journal of Finance and Risk Perspectives, 10(1), 95–\\n110. https://doi.org/10.35944/jofrp.2021.10.1.006 \\nThree lines of defense against risks from AI \\n22 \\nWachter, S., Mittelstadt, B., & Russell, C. (2021). Why fairness cannot be automated: \\nBridging the gap between EU non-discrimination law and AI. Computer Law & Security \\nReview, 41, 105567. https://doi.org/10.1016/j.clsr.2021.105567 \\nWalker, D. (2009). A review of corporate governance in UK banks and other financial in-\\ndustry entities: Final recommendations. https://perma.cc/2K9C-EMME \\nWeidinger, L., Mellor, J., Rauh, M., Griffin, C., Uesato, J., Huang, P.-S., Cheng, M., Glaese, \\nM., Balle, B., Kasirzadeh, A., Kenton, Z., Brown, S., Hawkins, W., Stepleton, T., Biles, \\nC., Birhane, A., Haas, J., Rimell, L., Hendricks, L. A., … Gabriel, I. (2021). Ethical and \\nsocial risks of harm from language models. arXiv. https://doi.org/10.48550/ar-\\nXiv.2112.04359 \\nZhivitskaya, M. (2015). The practice of risk oversight since the global financial crisis: Clo-\\nsing the stable door? 
[Doctoral dissertation, The London School of Economics and Poli-\\ntical Science]. LSE Theses Online. https://perma.cc/KKA6-QK56 \\n\\n\"}\n", "=================================\u001B[1m Tool Message \u001B[0m=================================\n", "Name: arvix_search\n", "\n", "{\"arvix_results\": \"\\nCorrelations of consumption patterns in social-economic\\nnetworks\\nYannick Leo1, M´arton Karsai1,*, Carlos Sarraute2 and Eric Fleury1\\n1Univ Lyon, ENS de Lyon, Inria, CNRS, UCB Lyon 1, LIP UMR 5668, IXXI, F-69342, Lyon, France\\n2Grandata Labs, Bartolome Cruz 1818 V. Lopez. Buenos Aires, Argentina\\n*Corresponding author: marton.karsai@ens-lyon.fr\\nAbstract\\nWe analyze a coupled anonymized dataset collecting the\\nmobile phone communication and bank transactions his-\\ntory of a large number of individuals.\\nAfter mapping\\nthe social structure and introducing indicators of socioe-\\nconomic status, demographic features, and purchasing\\nhabits of individuals we show that typical consumption\\npatterns are strongly correlated with identified socioe-\\nconomic classes leading to patterns of stratification in\\nthe social structure.\\nIn addition we measure correla-\\ntions between merchant categories and introduce a cor-\\nrelation network, which emerges with a meaningful com-\\nmunity structure.\\nWe detect multivariate relations be-\\ntween merchant categories and show correlations in pur-\\nchasing habits of individuals. Our work provides novel\\nand detailed insight into the relations between social and\\nconsuming behaviour with potential applications in rec-\\nommendation system design.\\n1\\nIntroduction\\nThe consumption of goods and services is a cru-\\ncial element of human welfare.\\nThe uneven dis-\\ntribution of consumption power among individuals\\ngoes hand in hand with the emergence and reserva-\\ntion of socioeconomic inequalities in general.\\nIndi-\\nvidual financial capacities restrict personal consumer\\nbehaviour, arguably correlate with one’s purchas-\\ning preferences, and play indisputable roles in deter-\\nmining the socioeconomic position of an ego in the\\nlarger society [1, 2, 3, 4, 5].\\nInvestigation of rela-\\ntions between these characters carries a great poten-\\ntial in understanding better rational social-economic\\nbehaviour [6], and project to direct applications in\\npersonal marketing, recommendation, and advertis-\\ning.\\nSocial\\nNetwork\\nAnalysis\\n(SNA)\\nprovides\\none\\npromising direction to explore such problems [7], due\\nto its enormous benefit from the massive flow of hu-\\nman behavioural data provided by the digital data\\nrevolution [8].\\nThe advent of this era was propa-\\ngated by some new data collection techniques, which\\nallowed the recording of the digital footprints and in-\\nteraction dynamics of millions of individuals [9, 10].\\nOn the other hand, although social behavioural data\\nbrought us detailed knowledge about the structure\\nand dynamics of social interactions, it commonly\\nfailed to uncover the relationship between social and\\neconomic positions of individuals. 
Nevertheless, such\\ncorrelations play important roles in determining one’s\\nsocioeconomic status (SES) [11], social tie formation\\npreferences due to status homophily [12, 13], and in\\nturn potentially stand behind the emergent stratified\\nstructure and segregation on the society level [4, 14].\\nHowever until now, the coupled investigation of indi-\\nvidual social and economic status remained a great\\nchallenge due to lack of appropriate data recording\\nsuch details simultaneously.\\nAs individual economic status restricts one’s capac-\\nity in purchasing goods and services, it induces diver-\\ngent consumption patterns between people at differ-\\nent socioeconomic positions [6, 1, 2]. This is reflected\\nby sets of commonly purchased products, which are\\nfurther associated to one’s social status [15]. Con-\\nsumption behaviour has been addressed from vari-\\nous angles considering e.g. environmental effects, so-\\ncioeconomic position, or social influence coming from\\nconnected peers [1]. However, large data-driven stud-\\nies combining information about individual purchas-\\ning and interaction patterns in a society large pop-\\nulation are still rare, although questions about cor-\\nrelations between consumption and social behaviour\\n1\\narXiv:1609.03756v2 [cs.SI] 21 Dec 2017\\nare of utmost interest.\\nIn this study we address these crucial problems\\nvia the analysis of a dataset,\\nwhich simultane-\\nously records the mobile-phone communication, bank\\ntransaction history, and purchase sequences of mil-\\nlions of inhabitants of a single country over several\\nmonths.\\nThis corpus, one among the firsts at this\\nscale and details, allows us to infer the socioeconomic\\nstatus, consumption habits, and the underlying social\\nstructure of millions of connected individuals. Using\\nthis information our overall goal is to identify people\\nwith certain financial capacities, and to understand\\nhow much money they spend, on what they spend,\\nand whether they spend like their friends? More pre-\\ncisely, we formulate our study around two research\\nquestions:\\n• Can one associate typical consumption patterns\\nto people and to their peers belonging to the\\nsame or different socioeconomic classes, and if\\nyes how much such patterns vary between indi-\\nviduals or different classes?\\n• Can one draw relations between commonly pur-\\nchased goods or services in order to understand\\nbetter individual consumption behaviour?\\nAfter reviewing the related literature in Section 2,\\nwe describe our dataset in Section 3, and introduce\\nindividual socioeconomic indicators to define socioe-\\nconomic classes in Section 4. In Section 5 we show\\nhow typical consumption patterns vary among classes\\nand relate them to structural correlations in the social\\nnetwork. In Section 6 we draw a correlation network\\nbetween consumption categories to detect patterns of\\ncommonly purchased goods and services. Finally we\\npresent some concluding remarks and future research\\nideas.\\n2\\nRelated work\\nEarlier hypothesis on the relation between consump-\\ntion patterns and socioeconomic inequalities, and\\ntheir correlations with demographic features such as\\nage, gender, or social status were drawn from spe-\\ncific sociological studies [16] and from cross-national\\nsocial surveys [17]. 
However, recently available large\\ndatasets help us to effectively validate and draw new\\nhypotheses as population-large individual level obser-\\nvations and detailed analysis of human behavioural\\ndata became possible. These studies shown that per-\\nsonal social interactions, social influence [1], or ho-\\nmophily [22] in terms of age or gender [20] have strong\\neffects on purchase behaviour, knowledge which led\\nto the emergent domain of online social market-\\ning [21].\\nYet it is challenging to measure correla-\\ntions between individual social status, social network,\\nand purchase patterns simultaneously. Although so-\\ncioeconomic parameters can be estimated from com-\\nmunication networks [18] or from external aggregate\\ndata [19] usually they do not come together with indi-\\nvidual purchase records. In this paper we propose to\\nexplore this question through the analysis of a com-\\nbined dataset proposing simultaneous observations of\\nsocial structure, economic status and purchase habits\\nof millions of individuals.\\n3\\nData description\\nIn the following we are going to introduce two\\ndatasets extracted from a corpus combining the mo-\\nbile phone interactions with purchase history of indi-\\nviduals.\\nDS1: Ego social-economic data with\\npurchase distributions\\nCommunication data used in our study records the\\ntemporal sequence of 7,945,240,548 call and SMS in-\\nteractions of 111,719,360 anonymized mobile phone\\nusers for 21 consecutive months. Each call detailed\\nrecord (CDR) contains the time, unique caller and\\ncallee encrypted IDs, the direction (who initiate the\\ncall/SMS), and the duration of the interaction. At\\nleast one participant of each interaction is a client of a\\nsingle mobile phone operator, but other mobile phone\\nusers who are not clients of the actual provider also\\nappear in the dataset with unique IDs. All unique\\nIDs are anonymized as explained below, thus indi-\\nvidual identification of any person is impossible from\\nthe data. Using this dataset we constructed a large\\nsocial network where nodes are users (whether clients\\nor not of the actual provider), while links are drawn\\nbetween any two users if they interacted (via call or\\nSMS) at least once during the observation period. We\\nfiltered out call services, companies, and other non-\\nhuman actors from the social network by removing\\nall nodes (and connected links) who appeared with\\neither in-degree kin = 0 or out-degree kout = 0.\\nWe repeated this procedure recursively until we re-\\nceived a network where each user had kin, kout > 0,\\ni.\\ne.\\nmade at least one out-going and received at\\nleast one in-coming communication event during the\\nnearly two years of observation. After construction\\n2\\nand filtering the network remained with 82,453,814\\nusers connected by 1,002,833,289 links, which were\\nconsidered to be undirected after this point.\\nTo calculate individual economic estimators we\\nused a dataset provided by a single bank. This data\\nrecords financial details of 6,002,192 people assigned\\nwith unique anonymized identifiers over 8 consecutive\\nmonths.\\nThe data provides time varying customer\\nvariables as the amount of their debit card purchases,\\ntheir monthly loans, and static user attributes such\\nas their billing postal code (zip code), their age and\\ntheir gender.\\nA subset of IDs of the anonymized bank and mobile\\nphone customers were matched1. 
This way of combining the datasets allowed us to simultaneously observe the social structure and estimate economic status (for definition see Section 4) of the connected individuals. This combined dataset contained 999,456 IDs, which appeared in both corpuses. However, for the purpose of our study we considered only the largest connected component of this graph. This way we operate with a connected social graph of 992,538 people connected by 1,960,242 links, for all of them with communication events and detailed bank records available.\nTo study consumption behaviour we used purchase sequences recording the time, amount, merchant category code of each purchase event of each individual during the observation period of 8 months. Purchase events are linked to one of the 281 merchant category codes (mcc) indicating the type of the actual purchase, like fast food restaurants, airlines, gas stations, etc. Due to the large number of categories in this case we decided to group mccs by their types into 28 purchase category groups (PCGs) using the categorization proposed in [23]. After analyzing each purchase groups 11 of them appeared with extremely low activity representing less than 0.3% (combined) of the total amount of purchases, thus we decided to remove them from our analysis and use only the remaining K17 set of 17 groups (for a complete list see Fig.2a). Note that the group named Service Providers (k1 with mcc 24) plays a particular role as it corresponds to cash retrievals and money transfers and it represents around 70% of the total amount of purchases. As this group dominates over other ones, and since we have no further information how the withdrawn cash was spent, we analyze this group k1 separately from the other K2-17 = K17 ∖ {k1} set of groups. This way we obtained DS1, which collects the social ties, economic status, and coarse grained purchase habit informations of ∼1 million people connected together into a large social network.\n[1] The matching, data hashing, and anonymization procedure was carried out without the involvement of the scientific partner. After this procedure only anonymized hashed IDs were shared disallowing the direct identification of individuals in any of the datasets.\nDS2: Detailed ego purchase distributions with age and gender\nFrom the same bank transaction trace of 6,002,192 users, we build a second data set DS2. This dataset collects data about the age and gender of individuals together with their purchase sequence recording the time, amount, and mcc of each debit card purchase of each ego. To obtain a set of active users we extracted a corpus of 4,784,745 people that were active at least two months during the observation period. Then for each ego, we assigned a feature set PV(u) : {age_u, gender_u, SEG_u, r(c_i, u)} where SEG assigns a socioeconomic group (for definition see Section 4) and r(c_i, u) is an ego purchase distribution vector defined as\nr(c_i, u) = m_u^{c_i} / Σ_{c_i} m_u^{c_i}    (1)\nThis vector assigns the fraction of m_u^{c_i} money spent by user u on a merchant category c_i during the observation period. We excluded purchases corresponding to cash retrievals and money transfers, which would dominate our measures otherwise. A minor fraction
A minor fraction\\nof purchases are not linked to valid mccs, thus we\\nexcluded them from our calculations.\\nThis way DS2 collects 3,680,652 individuals, with-\\nout information about their underlying social net-\\nwork, but all assigned with a PV (u) vector describing\\ntheir personal demographic and purchasing features\\nin details.\\n4\\nMeasures of socioeconomic position\\nTo estimate the personal economic status we used a\\nsimple measure reflecting the consumption power of\\neach individual. Starting from the raw data of DS2,\\nwhich collects the amount and type of debit card pur-\\nchases, we estimated the economic position of individ-\\nuals as their average monthly purchase (AMP). More\\nprecisely, in case of an ego u who spent mu(t) amount\\nin month t we calculated the AMP as\\nPu =\\nP\\nt∈T mu(t)\\n|T|u\\n(2)\\n3\\nwhere |T|u corresponds to the number of active\\nmonths of user u (with at least one purchase in each\\nmonth). After sorting people by their AMP values\\nwe computed the normalized cumulative distribution\\nfunction of Pu as\\nC(f) =\\nPf\\nf ′=0 Pu(f ′)\\nP\\nu Pu\\n(3)\\nas a function of f fraction of people.\\nThis func-\\ntion (Fig.1a) appears with high variance and sug-\\ngests large imbalances in terms of the distribution of\\neconomic capacities among individuals in agreement\\nwith earlier social theory [27].\\n0.0\\n0.2\\n0.4\\n0.6\\n0.8\\n1.0\\nf\\n0.0\\n0.2\\n0.4\\n0.6\\n0.8\\n1.0\\nCW(f)\\nCP(f)\\nf\\n(a)\\nClass 1\\nClass 4\\nClass 2\\nClass 3\\nClass 5\\nClass 8\\nClass 6\\nClass 7\\nClass 9\\n(a)\\n(b)\\nFig. 1: Social class characteristics (a) Schematic\\ndemonstration of user partitions into 9 socioe-\\nconomic classes by using the cumulative AMP\\nfunction C(f). Fraction of egos belonging to\\na given class (x axis) have the same sum of\\nAMP (P\\nu Pu)/n (y axis) for each class. (b)\\nNumber of egos (green) and the average AMP\\n⟨P⟩(in USD) per individual (yellow) in differ-\\nent classes.\\nSubsequently we used the C(f) function to assign\\negos into 9 economic classes (also called socioeco-\\nnomic classes with smaller numbers assigning lower\\nclasses) such that the sum of AMP in each class sj\\nwas the same equal to (P\\nu Pu)/n (Fig.1). We de-\\ncided to use 9 distinct classes based on the common\\nthree-stratum model [25], which identifies three main\\nsocial classes (lower, middle, and upper), and for each\\nof them three sub-classes [26]. There are several ad-\\nvantages of this classification:\\n(a) it relies merely\\non individual economic estimators, Pu, (b) naturally\\npartition egos into classes with decreasing sizes for\\nricher groups and (c) increasing ⟨P⟩average AMP\\nvalues per egos (Fig.1b).\\n5\\nSocioeconomic correlations in\\npurchasing patterns\\nIn order to address our first research question we\\nwere looking for correlations between individuals in\\ndifferent socioeconomic classes in terms of their con-\\nsumption behaviour on the level of purchase category\\ngroups.\\nWe analyzed the purchasing behaviour of\\npeople in DS1 after categorizing them into socioeco-\\nnomic classes as explained in Section 4.\\nFirst for each class sj we take every user u ∈sj\\nand calculate the mk\\nu total amount of purchases they\\nspent on a purchase category group k ∈K17. 
5 Socioeconomic correlations in purchasing patterns\nIn order to address our first research question we were looking for correlations between individuals in different socioeconomic classes in terms of their consumption behaviour on the level of purchase category groups. We analyzed the purchasing behaviour of people in DS1 after categorizing them into socioeconomic classes as explained in Section 4. First for each class s_j we take every user u ∈ s_j and calculate the m_u^k total amount of purchases they spent on a purchase category group k ∈ K17. Then we measure a fractional distribution of spending for each PCGs as:\nr(k, s_j) = Σ_{u∈s_j} m_u^k / Σ_{u∈s} m_u^k    (4)\nwhere s = ∪_j s_j assigns the complete set of users. In Fig.2a each line shows the r(k, s_j) distributions for a PCG as the function of s_j social classes, and lines are sorted (from top to bottom) by the total amount of money spent on the actual PCG [2]. Interestingly, people from lower socioeconomic classes spend more on PCGs associated to essential needs, such as Retail Stores (St.), Gas Stations, Service Providers (cash) and Telecom, while in the contrary, other categories associated to extra needs such as High Risk Personal Retail (Jewelry, Beauty), Mail Phone Order, Automobiles, Professional Services (Serv.) (extra health services), Whole Trade (auxiliary goods), Clothing St., Hotels and Airlines are dominated by people from higher socioeconomic classes. Also note that concerning Education most of the money is spent by the lower middle classes, while Miscellaneous St. (gift, merchandise, pet St.) and more apparently Entertainment are categories where the lowest and highest classes are spending the most.\nFrom this first analysis we can already identify large differences in the spending behaviour of people from lower and upper classes. To further investigate these dissimilarities on the individual level, we consider the K2-17 category set as defined in section 3 (category k1 excluded) and build a spending vector SV(u) = [SV_2(u), ..., SV_17(u)] for each ego u.\n[2] Note that in our social class definition the cumulative AMP is equal for each group and this way each group represents the same economic potential as a whole. Values shown in Fig.2a assign the total purchase of classes. Another strategy would be to calculate per capita measures, which in turn would be strongly dominated by values associated to the richest class, hiding any meaningful information about other classes.\nFig. 2: Consumption correlations in the socioeconomic network (a) r(k, s_i) distribution of spending in a given purchase category group k ∈ K17 by different classes s_j. Distributions are normalised as in Eq.4, i.e. sums up to 1 for each category. (b) Dispersion σ_SV(s_j) for different socioeconomic classes considering PCGs in K2-17 (dark blue) and the single category k1 (light blue). (c) (resp. (d)) Heat-map matrix representation of d_SV(s_i, s_j) (resp. d_k1(s_i, s_j)) distances between the average spending vectors of pairs of socioeconomic classes considering PCGs in K2-17 (resp. k1). (e) Shannon entropy measures for different socioeconomic classes considering PCGs in K2-17 (dark pink) and in k17 (light pink). (f) (resp. (g)) Heat-map matrix representation of the average L_SV(s_i, s_j) (resp. L_k1(s_i, s_j)) measure between pairs of socioeconomic classes considering PCGs in K2-17 (resp. k1).\nHere each item SV_k(u) assigns the fraction of money m_u^k/m_u that user u spent on a category k ∈ K2-17 out of his/her m_u = Σ_{k∈K} m_u^k total amount of purchases. Using these individual spending vectors we calculate the average spending vector of a given socioeconomic class as SV(s_j) = ⟨SV(u)⟩_{u∈s_j}. 
We associate SV(s_j) to a representative consumer of class s_j and use this average vector to quantify differences between distinct socioeconomic classes as follows. The euclidean metric between average spending vectors is:\nd_SV(s_i, s_j) = ∥SV_k(s_i) − SV_k(s_j)∥_2    (5)\nwhere ∥v∥_2 = √(Σ_k v_k^2) assigns the L2 norm of a vector v. Note that the diagonal elements of d_SV(s_i, s_i) are equal to zero by definition. However, in Fig.2c the off-diagonal green component around the diagonal indicates that the average spending behaviour of a given class is the most similar to neighboring classes, while dissimilarities increase with the gap between socioeconomic classes. We repeated the same measurement separately for the single category of cash purchases (PCG k1). In this case euclidean distance is defined between average scalar measures as d_k1(s_i, s_j) = ∥⟨SV_1⟩(s_i) − ⟨SV_1⟩(s_j)∥_2. Interestingly, results shown in Fig.2d indicates that here the richest social classes appear with a very different behaviour. This is due to their relative underspending in cash, which can be also concluded from Fig.2a (first row). On the other hand as going towards lower classes such differences decrease as cash usage starts to dominate.\nTo explain better the differences between socioeconomic classes in terms of purchasing patterns, we introduce two additional scalar measures. First, we introduce the dispersion of individual spending vectors as compared to their class average as\nσ_SV(s_j) = ⟨∥SV_k(s_j) − SV_k(u)∥_2⟩_{u∈s_j}    (6)\nwhich appears with larger values if people in a given class allocate their spending very differently. Second, we also calculate the Shannon entropy of spending patterns as\nS_SV(s_j) = Σ_{k∈K2-17} −SV_k(s_j) log(SV_k(s_j))    (7)\nto quantify the variability of the average spending vector for each class. This measure is minimal if each ego of a class s_j spends exclusively on the same single PCG, while it is maximal if they equally spend on each PCG. As it is shown in Fig.2b (light blue line with square symbols) dispersion decreases rapidly as going towards higher socioeconomic classes. This assigns that richer people tends to be more similar in terms of their purchase behaviour. On the other hand, surprisingly, in Fig.2e (dark pink line with square symbols) the increasing trend of the corresponding entropy measure suggests that even richer people behave more similar in terms of spending behaviour they used to allocate their purchases in more PCGs. These trends are consistent even in case of k1 cash purchase category (see σ_SV1(s_j) function depicted with dark blue line in Fig.2b) or once we include category k1 into the entropy measure S_SV17(s_j) (shown in Fig.2b with light pink line).
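As a sketch of how the quantities in Eqs. (5)-(7) could be computed, the snippet below derives class-average spending vectors, their pairwise euclidean distances d_SV, the within-class dispersion σ_SV, and the entropy S_SV. It assumes a DataFrame `SV` of individual spending vectors (rows summing to 1 over the K2-17 groups) and a Series `cls` mapping users to classes; both names are hypothetical, and this is an illustration rather than the authors' implementation.

```python
# Sketch of Eqs. (5)-(7): class-average spending vectors, pairwise distances,
# within-class dispersion, and Shannon entropy of the class averages.
import numpy as np
import pandas as pd

def class_statistics(SV: pd.DataFrame, cls: pd.Series):
    cls = cls.loc[SV.index]                   # align class labels with the rows of SV
    mean_sv = SV.groupby(cls).mean()          # average spending vector per class

    # d_SV(s_i, s_j): euclidean distance between class-average vectors (Eq. 5)
    diff = mean_sv.values[:, None, :] - mean_sv.values[None, :, :]
    d_sv = pd.DataFrame(np.linalg.norm(diff, axis=-1),
                        index=mean_sv.index, columns=mean_sv.index)

    # sigma_SV(s_j): mean distance of individual vectors from their class average (Eq. 6)
    deviations = np.linalg.norm(SV.values - mean_sv.loc[cls].values, axis=1)
    sigma_sv = pd.Series(deviations, index=SV.index).groupby(cls).mean()

    # S_SV(s_j): Shannon entropy of the class-average spending vector (Eq. 7)
    p = mean_sv.clip(lower=1e-12)             # guard against log(0)
    entropy = -(p * np.log(p)).sum(axis=1)

    return d_sv, sigma_sv, entropy
```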
To complete our investigation we characterize the effects of social relationships on the purchase habits of individuals. We address this problem through an overall measure quantifying differences between individual purchase vectors of connected egos positioned in the same or different socioeconomic classes. More precisely, we consider each social tie (u, v) ∈ E connecting individuals u ∈ s_i and v ∈ s_j, and for each purchase category k we calculate the average absolute difference of their purchase vector items as\nd^k(s_i, s_j) = ⟨|SV_k(u) − SV_k(v)|⟩_{u∈s_i, v∈s_j}    (8)\nFollowing that, as a reference system we generate a corresponding configuration network by taking randomly selected edge pairs from the underlying social structure and swap them without allowing multiple links and self loops. In order to vanish any residual correlations we repeated this procedure in 5×|E| times. This randomization keeps the degree, individual economic estimators P_u, the purchase vector SV(u), and the assigned class of each people unchanged, but destroys any structural correlations between egos in the social network, consequently between socioeconomic classes as well. After generating a reference structure we computed an equivalent measure d^k_rn(s_i, s_j) but now using links (u, v) ∈ E_rn of the randomized network. We repeated this procedure 100 times and calculated an average ⟨d^k_rn⟩(s_i, s_j). In order to quantify the effect of the social network we simply take the ratio\nL^k(s_i, s_j) = d^k(s_i, s_j) / ⟨d^k_rn⟩(s_i, s_j)    (9)\nand calculate its average L_SV(s_i, s_j) = ⟨L^k(s_i, s_j)⟩_k over each category group k ∈ K2-17 or respectively k1. This measure shows whether connected people have more similar purchasing patterns than one would expect by chance without considering any effect of homophily, social influence or structural correlations. Results depicted in Fig.2f and 2g for L_SV(s_i, s_j) (and L_k1(s_i, s_j) respectively) indicates that the purchasing patterns of individuals connected in the original structure are actually more similar than expected by chance (diagonal component). On the other hand people from remote socioeconomic classes appear to be less similar than one would expect from the uncorrelated case (indicated by the L_SV(s_i, s_j) > 1 values typical for upper classes in Fig.2f). Note that we found the same correlation trends in cash purchase patterns as shown in Fig.2g. These observations do not clearly assign whether homophily [12, 13] or social influence [1] induce the observed similarities in purchasing habits but undoubtedly clarifies that social ties (i.e. the neighbors of an ego) and socioeconomic status play deterministic roles in the emerging similarities in consumption behaviour.
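To make the null-model comparison of Eqs. (8)-(9) concrete, here is a minimal sketch using networkx's degree-preserving double edge swaps as a stand-in for the configuration-network randomization described above. It assumes a graph `G` whose nodes carry a "class" attribute and a dict `SV` mapping each node to its spending vector; all names are hypothetical and the snippet is illustrative rather than the authors' pipeline.

```python
# Sketch of Eqs. (8)-(9): tie-level dissimilarity per class pair, compared with
# a degree-preserving randomized reference built from repeated double edge swaps.
import numpy as np
import networkx as nx

def tie_dissimilarity(G, SV, k):
    # d^k(s_i, s_j): average |SV_k(u) - SV_k(v)| over ties between classes s_i and s_j (Eq. 8)
    diffs = {}
    for u, v in G.edges():
        pair = tuple(sorted((G.nodes[u]["class"], G.nodes[v]["class"])))
        diffs.setdefault(pair, []).append(abs(SV[u][k] - SV[v][k]))
    return {pair: float(np.mean(vals)) for pair, vals in diffs.items()}

def L_ratio(G, SV, k, n_random=100, seed=0):
    observed = tie_dissimilarity(G, SV, k)
    rng = np.random.default_rng(seed)
    randomized = {pair: [] for pair in observed}
    for _ in range(n_random):
        # 5 * |E| swap attempts, as in the text; swaps keep degrees, classes and SV fixed
        G_rn = G.copy()
        nx.double_edge_swap(G_rn, nswap=5 * G.number_of_edges(),
                            max_tries=100 * G.number_of_edges(),
                            seed=int(rng.integers(2**31 - 1)))
        for pair, value in tie_dissimilarity(G_rn, SV, k).items():
            if pair in randomized:
                randomized[pair].append(value)
    # L^k(s_i, s_j) = d^k / <d^k_rn> (Eq. 9)
    return {pair: observed[pair] / float(np.mean(randomized[pair])) for pair in observed}
```

Under this construction, ratios below 1 indicate that connected egos are more similar than the degree-preserving reference predicts, which is the pattern the text reports along the diagonal of Fig.2f.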
Our aim here is to obtain an overall\\npicture of the consumption structure at the level of\\nmerchant categories and to understand precisely how\\npersonal and socioeconomic features correlate with\\nthe spending behaviour of individuals and with the\\noverall consumption structure.\\nAs we noted in section 5, the purchase spending\\nvector r(ci, u) of an ego quantifies the fraction of\\nmoney spent on a category ci. Using the spending\\nvectors of n number of individuals we define an over-\\nall correlation measure between categories as\\nρ(ci, cj) =\\nn(P\\nu r(ci, u)r(cj, u))\\n(P\\nu r(ci, u))(P\\nu r(cj, u)).\\n(10)\\n6\\n5211\\n1711\\n5251\\n5533\\n5942\\n2741\\n5943\\n5964\\n4111\\n4011\\n4112\\n4511\\n4722\\n5651\\n5813\\n5947\\n7011\\n4121\\n4131\\n4789\\n5309\\n5331\\n5732\\n5948\\n5993\\n5999\\n7922\\n7991\\n7999\\n9399\\n5691\\n7399\\n4215\\n4784\\n4816\\n5192\\n5399\\n5734\\n5735\\n5811\\n5812\\n5814\\n5968\\n5969\\n5970\\n5992\\n5994\\n7216\\n7230\\n7298\\n7311\\n7392\\n7512\\n7523\\n7542\\n7933\\n7941\\n7996\\n7997\\n8999\\n5967\\n5045\\n5046\\n5065\\n5085\\n5111\\n5995\\n7538\\n4582\\n5200\\n5310\\n5541\\n9311\\n4812\\n7321\\n4899\\n7372\\n7994\\n5945\\n7273\\n5983\\n4900\\n5039\\n5013\\n5072\\n5198\\n5511\\n5532\\n5021\\n5712\\n5231\\n5719\\n5950\\n5733\\n7993\\n5047\\n8011\\n8021\\n8062\\n8071\\n5722\\n5074\\n5094\\n5621\\n5631\\n5699\\n5944\\n5977\\n5131\\n5441\\n5949\\n5122\\n5137\\n5661\\n5139\\n5169\\n5172\\n5193\\n5714\\n7629\\n763\\n5655\\n5641\\n5451\\n5462\\n5973\\n5542\\n7622\\n5599\\n5571\\n5611\\n5935\\n5941\\n5697\\n5681\\n5931\\n5971\\n7296\\n7297\\n7841\\n7832\\n7210\\n7211\\n7932\\n8049\\n5921\\n7929\\n5940\\n5976\\n8641\\n5946\\n7338\\n7221\\n5965\\n7277\\n742\\n7299\\n7998\\n7361\\n8099\\n7995\\n8211\\n8220\\n(a)\\n(b)\\nCar sales and maintenance\\nHardware stores\\nOffice supply stores\\nIT services\\nBooks and newspapers\\nState services and education\\nHome supply stores\\nNewsstand and duty-free shops\\nAmusement and recreation\\nTravelling\\nTransportation and commuting\\nLeisure\\nJewellery and gift shops\\nClothing 1\\nClothing 2\\nPersonal services\\nHealth and medical services\\nFig. 3: Merchant category correlation matrix and graph (a) 163×163 matrix heatmap plot corre-\\nsponding to ρ(ci, cj) correlation values (see Eq. 10) between categories. Colors scale with the loga-\\nrithm of correlation values. Positive (resp. negative) correlations are assigned by red (resp. blue)\\ncolors. Diagonal components represent communities with frames colored accordingly.(b) Weighted\\nG>\\nρ correlation graph with nodes annotated with MCCs (see Table 1). Colors assign 17 communities\\nof merchant categories with representative names summarized in the figure legend.\\n0\\n0.5\\n1\\nfemale male\\n(a)\\n(b)\\nFig. 4: Socioeconomic parameters of merchant categories (a) Scatter plot of AFS(ci) triplets (for\\ndefinition see Eq. 11 and text) for 271 merchant categories summarized in Table 1.\\nAxis assign\\naverage age and SEG of purchase categories, while gender information are assigned by symbols. The\\nshape of symbols assigns the dominant gender (circle-female, square-male) and their size scales with\\naverage values. (b) Similar scatter plot computed for communities presented in Fig.3b. Labels and\\ncolors are explained in the legend of Fig.3a.\\n7\\nThis symmetric formulae quantifies how much peo-\\nple spend on a category ci if they spend on an other\\ncj category or vice versa. 
Therefore, if ρ(ci, cj) > 1,\\nthe categories ci and cj are positively correlated and\\nif ρ(ci, cj) < 1, categories are negatively correlated.\\nUsing ρ(ci, cj) we can define a weighted correlation\\ngraph Gρ = (Vρ, Eρ, ρ) between categories ci ∈Vρ,\\nwhere links (ci, cj) ∈Eρ are weighted by the ρ(ci, cj)\\ncorrelation values.\\nThe weighted adjacency matrix\\nof Gρ is shown in Fig.3a as a heat-map matrix with\\nlogarithmically scaling colors. Importantly, this ma-\\ntrix emerges with several block diagonal components\\nsuggesting present communities of strongly correlated\\ncategories in the graph.\\nTo identify categories which were commonly pur-\\nchased together we consider only links with positive\\ncorrelations. Furthermore, to avoid false positive cor-\\nrelations, we consider a 10% error on r that can in-\\nduce, in the worst case 50% overestimation of the\\ncorrelation values. In addition, to consider only rep-\\nresentative correlations we take into account category\\npairs which were commonly purchased by at least\\n1000 consumers. This way we receive a G>\\nρ weighted\\nsub-graph of Gρ, shown in Fig.3b, with 163 nodes\\nand 1664 edges with weights ρ(ci, cj) > 1.5.\\nTo identify communities in G>\\nρ indicated by the\\ncorrelation matrix in Fig.3a we applied a graph parti-\\ntioning method based on the Louvain algorithm [28].\\nWe obtained 17 communities depicted with differ-\\nent colors in Fig.3b and as corresponding colored\\nframes in Fig.3a.\\nInterestingly, each of these com-\\nmunities group a homogeneous set of merchant cat-\\negories, which could be assigned to similar types of\\npurchasing activities (see legend of Fig.3b). In addi-\\ntion, this graph indicates how different communities\\nare connected together. Some of them, like Trans-\\nportation, IT or Personal Serv.\\nplaying a central\\nrole as connected to many other communities, while\\nother components like Car sales and maintenance\\nand Hardware St., or Personal and Health and med-\\nical Serv. are more like pairwise connected. Some\\ngroups emerge as standalone communities like Office\\nSupp.\\nSt., while others like Books and newspapers\\nor Newsstands and duty-free Shops (Sh.) appear as\\nbridges despite their small sizes.\\nNote that the main categories corresponding to\\neveryday necessities related to food (Supermarkets,\\nFood St.)\\nand telecommunication (Telecommunica-\\ntion Serv.) do not appear in this graph. Since they\\nare responsible for the majority of total spending,\\nthey are purchased necessarily by everyone without\\nobviously enhancing the purchase in other categories,\\nthus they do not appear with strong correlations.\\nFinally we turn to study possible correlations\\nbetween\\npurchase\\ncategories\\nand\\npersonal\\nfea-\\ntures.\\nAn\\naverage\\nfeature\\nset\\nAFS(ci)\\n=\\n{⟨age(ci)⟩, ⟨gender(ci)⟩, ⟨SEG(ci}⟩) is assigned to\\neach of the 271 categories.\\nThe average ⟨v(ci)⟩of\\na feature v ∈{age, gender, SEG} assigns a weighted\\naverage value computed as:\\n⟨v(ci)⟩=\\nP\\nu∈{u}i αi(vu)vu\\nP\\nu∈{u}u αi(v) ,\\n(11)\\nwhere vu denotes a feature of a user u from the {u}i\\nset of individuals who spent on category ci. Here\\nαi(vu) =\\nX\\n(u∈{u}i|vu=v)\\nr(ci, u)\\nni(vu)\\n(12)\\ncorresponds to the average spending on category ci\\nof the set of users from {u}i sharing the same value\\nof the feature v. ni(vu) denotes the number of such\\nusers. In other words, e.g. 
in case of v = age and c742,\\n⟨age(c742)⟩assigns the average age of people spent\\non Veterinary Services (mcc = 742) weighted by the\\namount they spent on it. In case of v = gender we\\nassigned 0 to females and 1 to males, thus the average\\ngender of a category can take any real value between\\n[0, 1], indicating more females if ⟨gender(ci)⟩≤0.5\\nor more males otherwise.\\nWe visualize this multi-modal data in Fig.4a as\\na scatter plot, where axes scale with average age\\nand SEG, while the shape and size of symbols corre-\\nspond to the average gender of each category. To fur-\\nther identify correlations we applied k-means cluster-\\ning [29] using the AFS(ci) of each category. The ideal\\nnumber of clusters was 15 according to several crite-\\nria: Davies-Bouldin Criterion, Calinski-Harabasz cri-\\nterion (variance ratio criterion) and the Gap method\\n[30].\\nColors in Fig.4a assign the identified k-mean\\nclusters.\\nThe first thing to remark in Fig.4a is that the av-\\nerage age and SEG assigned to merchant categories\\nare positively correlated with a Pearson correlation\\ncoefficient 0.42 (p < 0.01). In other words, elderly\\npeople used to purchase from more expensive cate-\\ngories, or alternatively, wealthier people tend to be\\nolder, in accordance with our intuition. At the same\\ntime, some signs of gender imbalances can be also\\nconcluded from this plot. Wealthier people appear to\\nbe commonly males rather than females. A Pearson\\ncorrelation measure between gender and SEG, which\\n8\\n742: Veterinary Serv.\\n5072: Hardware Supp.\\n5598: Snowmobile Dealers\\n5950: Glassware, Crystal St.\\n7296: Clothing Rental\\n7941: Sports Clubs\\n763: Agricultural Cooperative\\n5074: Plumbing, Heating Equip.\\n5599: Auto Dealers\\n5960: Dir Mark - Insurance\\n7297: Massage Parlors\\n7991: Tourist Attractions\\n780: Landscaping Serv.\\n5085: Industrial Supplies\\n5611: Men Cloth. St.\\n5962: Direct Marketing - Travel\\n7298: Health and Beauty Spas\\n7992: Golf Courses\\n1520: General Contr.\\n5094: Precious Objects/Stones\\n5621: Wom Cloth. St.\\n5963: Door-To-Door Sales\\n7299: General Serv.\\n7993: Video Game Supp.\\n1711: Heating, Plumbing\\n5099: Durable Goods\\n5631: Women?s Accessory Sh. 5964: Dir. Mark. Catalog\\n7311: Advertising Serv.\\n7994: Video Game Arcades\\n1731: Electrical Contr.\\n5111: Printing, Office Supp.\\n5641: Children?s Wear St.\\n5965: Dir. Mark. Retail Merchant 7321: Credit Reporting Agencies\\n7995: Gambling\\n1740: Masonry & Stonework\\n5122: Drug Proprietaries\\n5651: Family Cloth. St.\\n5966: Dir Mark - TV\\n7333: Graphic Design\\n7996: Amusement Parks\\n1750: Carpentry Contr.\\n5131: Notions Goods\\n5655: Sports & Riding St.\\n5967: Dir. Mark.\\n7338: Quick Copy\\n7997: Country Clubs\\n1761: Sheet Metal\\n5137: Uniforms Clothing\\n5661: Shoe St.\\n5968: Dir. Mark. Subscription\\n7339: Secretarial Support Serv.\\n7998: Aquariums\\n1771: Concrete Work Contr.\\n5139: Commercial Footwear\\n5681: Furriers Sh.\\n5969: Dir. Mark. Other\\n7342: Exterminating Services\\n7999: Recreation Serv.\\n1799: Special Trade Contr.\\n5169: Chemicals Products\\n5691: Cloth. 
Stores\\n5970: Artist?s Supp.\\n7349: Cleaning and Maintenance\\n8011: Doctors\\n2741: Publishing and Printing 5172: Petroleum Products\\n5697: Tailors\\n5971: Art Dealers & Galleries\\n7361: Employment Agencies\\n8021: Dentists, Orthodontists\\n2791: Typesetting Serv.\\n5192: Newspapers\\n5698: Wig and Toupee St.\\n5972: Stamp and Coin St.\\n7372: Computer Programming\\n8031: Osteopaths\\n2842: Specialty Cleaning\\n5193: Nursery & Flowers Supp.\\n5699: Apparel Accessory Sh.\\n5973: Religious St.\\n7375: Information Retrieval Serv.\\n8041: Chiropractors\\n4011: Railroads\\n5198: Paints\\n5712: Furniture\\n5975: Hearing Aids\\n7379: Computer Repair\\n8042: Optometrists\\n4111: Ferries\\n5199: Nondurable Goods\\n5713: Floor Covering St.\\n5976: Orthopedic Goods\\n7392: Consulting, Public Relations 8043: Opticians\\n4112: Passenger Railways\\n5200: Home Supply St.\\n5714: Window Covering St.\\n5977: Cosmetic St.\\n7393: Detective Agencies\\n8049: Chiropodists, Podiatrists\\n4119: Ambulance Serv.\\n5211: Materials St.\\n5718: Fire Accessories St.\\n5978: Typewriter St.\\n7394: Equipment Rental\\n8050: Nursing/Personal Care\\n4121: Taxicabs\\n5231: Glass & Paint St.\\n5719: Home Furnishing St.\\n5983: Fuel Dealers (Non Auto)\\n7395: Photo Developing\\n8062: Hospitals\\n4131: Bus Lines\\n5251: Hardware St.\\n5722: House St.\\n5992: Florists\\n7399: Business Serv.\\n8071: Medical Labs\\n4214: Motor Freight Carriers\\n5261: Nurseries & Garden St.\\n5732: Elec. St.\\n5993: Cigar St.\\n7512: Car Rental Agencies\\n8099: Medical Services\\n4215: Courier Serv.\\n5271: Mobile Home Dealers\\n5733: Music Intruments St.\\n5994: Newsstands\\n7513: Truck/Trailer Rentals\\n8111: Legal Services, Attorneys\\n4225: Public Storage\\n5300: Wholesale\\n5734: Comp.Soft. St.\\n5995: Pet Sh.\\n7519: Mobile Home Rentals\\n8211: Elem. Schools\\n4411: Cruise Lines\\n5309: Duty Free St.\\n5735: Record Stores\\n5996: Swimming Pools Sales\\n7523: Parking Lots, Garages\\n8220: Colleges Univ.\\n4457: Boat Rentals and Leases 5310: Discount Stores\\n5811: Caterers\\n5997: Electric Razor St.\\n7531: Auto Body Repair Sh.\\n8241: Correspondence Schools\\n4468: Marinas Serv. and Supp. 5311: Dep. St.\\n5812: Restaurants\\n5998: Tent and Awning Sh.\\n7534: Tire Retreading & Repair\\n8244: Business Schools\\n4511: Airlines\\n5331: Variety Stores\\n5813: Drinking Pl.\\n5999: Specialty Retail\\n7535: Auto Paint Sh.\\n8249: Training Schools\\n4582: Airports, Flying Fields\\n5399: General Merch.\\n5814: Fast Foods\\n6211: Security Brokers\\n7538: Auto Service Shops\\n8299: Educational Serv.\\n4722: Travel Agencies\\n5411: Supermarkets\\n5912: Drug St.\\n6300: Insurance\\n7542: Car Washes\\n8351: Child Care Serv.\\n4784: Tolls/Bridge Fees\\n5422: Meat Prov.\\n5921: Alcohol St.\\n7011: Hotels\\n7549: Towing Serv.\\n8398: Donation\\n4789: Transportation Serv.\\n5441: Candy St.\\n5931: Secondhand Stores\\n7012: Timeshares\\n7622: Electronics Repair Sh.\\n8641: Associations\\n4812: Phone St.\\n5451: Dairy Products St.\\n5932: Antique Sh.\\n7032: Sporting Camps\\n7623: Refrigeration Repair\\n8651: Political Org.\\n4814: Telecom.\\n5462: Bakeries\\n5933: Pawn Shops\\n7033: Trailer Parks, Camps\\n7629: Small Appliance Repair\\n8661: Religious Orga.\\n4816: Comp. Net. 
Serv.\\n5499: Food St.\\n5935: Wrecking Yards\\n7210: Laundry, Cleaning Serv.\\n7631: Watch/Jewelry Repair\\n8675: Automobile Associations\\n4821: Telegraph Serv.\\n5511: Cars Sales\\n5937: Antique Reproductions 7211: Laundries\\n7641: Furniture Repair\\n8699: Membership Org.\\n4899: Techno St.\\n5521: Car Repairs Sales\\n5940: Bicycle Sh.\\n7216: Dry Cleaners\\n7692: Welding Repair\\n8734: Testing Lab.\\n4900: Utilities\\n5531: Auto and Home Supp. St.\\n5941: Sporting St.\\n7217: Upholstery Cleaning\\n7699: Repair Sh.\\n8911: Architectural Serv.\\n5013: Motor Vehicle Supp.\\n5532: Auto St.\\n5942: Book St.\\n7221: Photographic Studios\\n7829: Picture/Video Production\\n8931: Accounting Serv.\\n5021: Commercial Furniture\\n5533: Auto Access.\\n5943: Stationery St.\\n7230: Beauty Sh.\\n7832: Cinema\\n8999: Professional Serv.\\n5039: Constr. Materials\\n5541: Gas Stations\\n5944: Jewelry St.\\n7251: Shoe Repair/Hat Cleaning\\n7841: Video Tape Rental St.\\n9211: Courts of Law\\n5044: Photographic Equip.\\n5542: Automated Fuel Dispensers 5945: Toy,-Game Sh.\\n7261: Funeral Serv.\\n7911: Dance Hall & Studios\\n9222: Government Fees\\n5045: Computer St.\\n5551: Boat Dealers\\n5946: Camera and Photo St.\\n7273: Dating/Escort Serv.\\n7922: Theater Ticket\\n9223: Bail and Bond Payments\\n5046: Commercial Equipment\\n5561: Motorcycle Sh.\\n5947: Gift Sh.\\n7276: Tax Preparation Serv.\\n7929: Bands, Orchestras\\n9311: Tax Payments\\n5047: Medical Equipment\\n5571: Motorcycle Sh.\\n5948: Luggage & Leather St.\\n7277: Counseling Services\\n7932: Billiard/Pool\\n9399: Government Serv.\\n5051: Metal Service Centers\\n5592: Motor Homes Dealers\\n5949: Fabric St.\\n7278: Buying/Shopping Serv.\\n7933: Bowling\\n9402: Postal Serv.\\n5065: Electrical St.\\nTab. 1: Codes and names of 271 merchant categories used in our study. MCCs were taken from the Merchant\\nCategory Codes and Groups Directory published by American Express [23]. Abbreviations corre-\\nspond to: Serv. - Services, Contr. - Contractors, Supp. - Supplies, St. - Stores, Equip. - Equipment,\\nMerch. - Merchandise, Prov. - Provisioners, Pl. - Places, Sh. - Shops, Mark. - Marketing, Univ. -\\nUniversities, Org. - Organizations, Lab. - Laboratories.\\nappears with a coefficient 0.29 (p < 0.01) confirmed\\nit. On the other hand, no strong correlation was ob-\\nserved between age and gender from this analysis.\\nTo have an intuitive insight about the distribution\\nof merchant categories, we take a closer look at spe-\\ncific category codes (summarized in Table 1).\\nAs\\nseen in Fig.4a elderly people tend to purchase in spe-\\ncific categories such as Medical Serv., Funeral Serv.,\\nReligious Organisations, Motorhomes Dealers, Dona-\\ntion, Legal Serv..\\nWhereas categories such as Fast\\nFoods, Video Game Arcades, Cinema, Record St., Ed-\\nucational Serv., Uniforms Clothing, Passenger Rail-\\nways, Colleges-Universities are associated to younger\\nindividuals on average.\\nAt the same time, wealth-\\nier people purchase more in categories as Snowmo-\\nbile Dealers, Secretarial Serv., Swimming Pools Sales,\\nCar Dealers Sales, while poorer people tend to pur-\\nchase more in categories related to everyday neces-\\nsities like Food St., General Merch., Dairy Products\\nSt., Fast Foods and Phone St., or to entertainment as\\nBilliard or Video Game Arcades. Typical purchase\\ncategories are also strongly correlated with gender as\\ncategories more associated to females are like Beauty\\nSh., Cosmetic St., Health and Beauty Spas, Women\\nClothing St. 
and Child Care Serv., while others are\\npreferred by males like Motor Homes Dealers, Snow-\\nmobile Dealers, Dating/Escort Serv., Osteopaths, In-\\nstruments St., Electrical St., Alcohol St. and Video\\nGame Arcades.\\nFinally we repeated a similar analysis on commu-\\nnities shown in Fig.3b, but computing the AFS on a\\nset of categories that belong to the same community.\\nResults in Fig.4b disclose positive age-SEG correla-\\ntions as observed in Fig.4a, together with somewhat\\n9\\nintuitive distribution of the communities.\\n7\\nConclusion\\nIn this paper we analyzed a multi-modal dataset col-\\nlecting the mobile phone communication and bank\\ntransactions of a large number of individuals living\\nin a single country. This corpus allowed for an in-\\nnovative global analysis both in term of social net-\\nwork and its relation to the economical status and\\nmerchant habits of individuals. We introduced sev-\\neral measures to estimate the socioeconomic status of\\neach individual together with their purchasing habits.\\nUsing these information we identified distinct socioe-\\nconomic classes, which reflected strongly imbalanced\\ndistribution of purchasing power in the population.\\nAfter mapping the social network of egos from mo-\\nbile phone interactions, we showed that typical con-\\nsumption patterns are strongly correlated with the\\nsocioeconomic classes and the social network behind.\\nWe observed these correlations on the individual and\\nsocial class level.\\nIn the second half of our study we detected corre-\\nlations between merchant categories commonly pur-\\nchased together and introduced a correlation network\\nwhich in turn emerged with communities grouping\\nhomogeneous sets of categories. We further analyzed\\nsome multivariate relations between merchant cate-\\ngories and average demographic and socioeconomic\\nfeatures, and found meaningful patterns of correla-\\ntions giving insights into correlations in purchasing\\nhabits of individuals.\\nWe identified several new directions to explore in\\nthe future.\\nOne possible track would be to better\\nunderstand the role of the social structure and inter-\\npersonal influence on individual purchasing habits,\\nwhile the exploration of correlated patterns between\\ncommonly purchased brands assigns another promis-\\ning directions. Beyond our general goal to better un-\\nderstand the relation between social and consuming\\nbehaviour these results may enhance applications to\\nbetter design marketing, advertising, and recommen-\\ndation strategies, as they assign relations between co-\\npurchased product categories.\\nAcknowledgment\\nWe thank M. Fixman for assistance.\\nWe acknowl-\\nedge the support from the SticAmSud UCOOL\\nproject, INRIA, and the SoSweet (ANR-15-CE38-\\n0011-01) and CODDDE (ANR-13-CORD-0017-01)\\nANR projects.\\nReferences\\n[1] A. Deaton, Understanding Consumption. Claren-\\ndon Press (1992).\\n[2] A. Deaton and J. Muellbauer, Economics and\\nConsumer Behavior. Cambridge University Press\\n(1980).\\n[3] T. Piketti, Capital in the Twenty-First Century.\\n(Harvard University Press, 2014).\\n[4] S. Sernau, Social Inequality in a Global Age.\\n(SAGE Publications, 2013).\\n[5] C. E. Hurst, Social Inequality. 8th ed. (Pearson\\nEducation, 2015).\\n[6] J. E. Fisher, Social Class and Consumer Behavior:\\nthe Relevance of Class and Status”, in Advances\\nin Consumer Research Vol. 14, eds. M. Wallen-\\ndorf and P. Anderson, Provo, UT : Association\\nfor Consumer Research, pp 492–496 (1987) .\\n[7] S. 
Wasserman, K. Faust, Social Network Analy-\\nsis: Methods and Applications. (Cambridge Uni-\\nversity Press, 1994).\\n[8] S. Lohr, The age of big data. (New York Times,\\n2012).\\n[9] D. Lazer, et. al. Computational Social Science.\\nScience 323, 721–723 (2009)\\n[10] A. Abraham, A-E. Hassanien, V. Smasel (eds.),\\nComputational Social Network Analysis: Trends,\\nTools and Research Advances. (Springer-Verlag,\\n2010).\\n[11] P. Bourdieu, Distinction: A Social Critique of\\nthe Judgement of Taste. Harvard University Press\\n(Cambridge MA) (1984).\\n[12] M. McPherson, L. Smith-Lovin, J. M. Cook,\\nBirds of a Feather:\\nHomophily in Social Net-\\nworks. Ann. Rev. Sociol. 27 415–444 (2001).\\n[13] P. F. Lazarsfeld, R. K. Merton, Friendship as a\\nSocial Process: A Substantive and Methodologi-\\ncal Analysis. In Freedom and Control in Modern\\nSociety. (New York: Van Nostrand, 1954) pp. 18–\\n66.\\n10\\n[14] D. B. Grusky, Theories of Stratification and In-\\nequality. In The Concise Encyclopedia of Sociol-\\nogy. pp. 622-624. (Wiley-Blackwell, 2011).\\n[15] P. West, Conspicuous Compassion: Why Some-\\ntimes It Really Is Cruel To Be Kind. Civitas, In-\\nstitute for the Study of Civil Society (London)\\n(2004).\\n[16] T. W. Chang, Social status and cultural con-\\nsumption Cambridge University Press (2010)\\n[17] A. Deaton, The analysis of household surveys: a\\nmicroeconometric approach to development pol-\\nicy. World Bank Publications (1997)\\n[18] Y. Dong, et. al., Inferring user demographics and\\nsocial strategies in mobile social networks. Proc.\\nof the 20th ACM SIGKDD international confer-\\nence on Knowledge discovery and data mining,\\n15–24 (2014)\\n[19] N. Eagle, M. Macy, R. Claxton, Network di-\\nversity and economic development. Science 328,\\n1029–1031 (2010)\\n[20] L. Kovanen, et. al., Temporal motifs reveal ho-\\nmophily, gender-specific patterns, and group talk\\nin call sequences. Proc. Nat. Acad. Sci., 110,\\n18070–18075 (2013)\\n[21] R. Felix, P. A. Rauschnabel, C. Hinsch, Elements\\nof strategic social media marketing: A holistic\\nframework. J. Business Res. online 1st (2016)\\n[22] W. Wood, T. Hayes, Social Influence on con-\\nsumer decisions:\\nMotives, modes, and conse-\\nquences. J. Consumer Psych. 22, 324–328 (2012).\\n[23] Merchant Category Codes and Groups Direc-\\ntory. American Express @ Work Reporting Ref-\\nerence (http://tinyurl.com/hne9ct5) (2008) (date\\nof access: 2/3/2016).\\n[24] P. Martineau, Social classes and spending behav-\\nior. Journal of Marketing 121–130 (1958).\\n[25] D.F. Brown, Social class and Status. In Mey, Ja-\\ncob Concise Encyclopedia of Pragmatics. Elsevier\\np. 953 (2009).\\n[26] P. Saunders, Social Class and Stratification.\\n(Routledge, 1990).\\n[27] V. Pareto, Manual of Political Economy. Reprint\\n(New English Trans) edition (1971).\\n[28] V. Blondel, et. al., Fast unfolding of communi-\\nties in large networks. J. Stat.l Mech: theory and\\nexperiment P10008 (2008).\\n[29] C. M. Bishop, Neural Networks for Pattern\\nRecognition. (Oxford University Press, Oxford,\\nEngland) (1995).\\n[30] R. Tibshirani, G. Walther, T. Hastie, Estimating\\nthe number of clusters in a data set via the gap\\nstatistic. J. Roy. Stat. Soc. 
B 63, 411-423 (2001).\\n11\\n\\n\\n\\n---\\n\\n\\nThe Masterclass of particle physics and scientific\\ncareers from the point of view of male and female\\nstudents\\nSandra Leone∗\\nINFN Sezione di Pisa\\nE-mail: sandra.leone@pi.infn.it\\nThe Masterclass of particle physics is an international outreach activity which provides an op-\\nportunity for high-school students to discover particle physics. The National Institute of Nuclear\\nPhysics (INFN) in Pisa has taken part in this effort since its first year, in 2005. The Masterclass\\nhas become a point of reference for the high schools of the Tuscan area around Pisa. Each year\\nmore than a hundred students come to our research center for a day. They listen to lectures, per-\\nform measurements on real data and finally they join the participants from the other institutes in a\\nvideo conference, to discuss their results. At the end of the day a questionnaire is given to the stu-\\ndents to assess if the Masterclass met a positive response. Together with specific questions about\\nthe various activities they took part in during the day, we ask them if they would like to become\\na scientist. They are offered 15 possible motivations for a “yes” or a “no” to choose from. The\\ndata collected during the years have been analysed from a gender perspective. Attracting female\\nstudents to science and technology-related careers is a very real issue in the European countries.\\nWith this study we tried to investigate if male and female students have a different perception of\\nscientific careers. At the end, we would like to be able to provide hints on how to intervene to\\ncorrect the path that seems to naturally bring male students towards STEM disciplines (science,\\ntechnology, engineering, and mathematics) and reject female students from them.\\n38th International Conference on High Energy Physics\\n3-10 August 2016\\nChicago, USA\\n∗Speaker.\\nc\\n⃝Copyright owned by the author(s) under the terms of the Creative Commons\\nAttribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).\\nhttp://pos.sissa.it/\\narXiv:1611.05297v1 [physics.ed-ph] 16 Nov 2016\\nMasterclass and scientific careers\\nSandra Leone\\n1. Introduction\\nThe International Masterclasses for Particle Physics (MC) give students the opportunity to be\\nparticle physicists for a day [1]. Each year in spring high school students and their teachers spend\\none day in reasearch institutes and universities around the world. They first attend introductory\\nlectures about particle physics (on the standard model of elementary particles, accelerators and\\ndetectors), then they work as scientists, making measurements on real data collected at CERN by\\nthe LHC experiments. At the end of their research day they experience the international aspect of\\nreal collaborations in particle physics, by presenting their findings in a video linkup with CERN or\\nFermilab and student groups in other participating countries.\\nThe Pisa unit of the National Institute for Nuclear Physics joined the MC since the first year,\\nin 2005 (World Year of Physics) [2]. Each year more than a hundred students 18-19 years old\\nattending the last year (the fifth one) of high school come to our institute. 
They are selected by\\ntheir schools, taking into account their expression of interest for the initiative and the previous year\\ngrades; in addition, since a few years we ask teachers to reflect the gender distribution of the school\\nin the list of selected students.\\nAt the end of the videoconference a questionnaire is given to the students to assess if the Mas-\\nterclass met a positive response. Approximately 80% of the students taking part to the Masterclass\\nfill the questionnaire. Together with specific questions about the various activities they attended\\nduring the day, we ask them if they would like to become a scientist. The data collected since 2010\\nhave been analyzed from a gender perspective. About 500 students filled the questionnaire, 300\\nmale and 200 female students.\\n2. Analysis of the questionnaire: general part\\nWe ask the students several questions related to the various aspects of the Masterclass: were\\nthe lectures understandable? was your physics background adequate? was the measurement fun?\\nwas the videoconference easy to follow? Then we ask them more general questions: were the Mas-\\nterclass topics interesting? was the Masterclass helpful to better understand what physics is and for\\nthe choise of your future studies? after taking part to the Masterclass, is your interest for physics\\nless, equal, or more than before? is it worth to participate to a particle physics Masterclass?\\nFig. 1 shows an example of the answers to some of the questions, in blue for male students, in\\nred for female students. One can see that the distribution of answers is very similar, for male and\\nfemale students. Fig. 2 (left) shows the only question for which we get a different distribution of\\nthe answers: are you interested in physics outside school? A similar pattern was already observed\\nin a very preliminary study performed on a smaller number of questionnaire in 2010 [3].\\n3. Analysis of the questionnaire: would you like to be a scientist?\\nFinally, we ask the students: would you like to work or do research in a STEM (physics,\\ntechnology, engeneering, and mathematics) discipline? The distribution of their answers is shown\\nin fig. 2 (right). A certain difference between male and female answers is seen.\\n1\\nMasterclass and scientific careers\\nSandra Leone\\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\n70 \\n80 \\nNO \\nPLUS NO PLUS YES \\nYES \\nMale \\nFemale \\nWere the Masterclass topics interesting? \\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\nNO \\nPLUS NO PLUS YES \\nYES \\nMale \\nFemale \\nWas the Masterclass useful to understand \\nwhat is physics? \\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\n70 \\n80 \\nLess \\nAs before \\nIncreased \\nMale \\nFemale \\nAfter taking part to the Masterclass your interest \\nfor physics is... \\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\n70 \\n80 \\nNO \\nPLUS NO PLUS YES \\nYES \\nMale \\nFemale \\nWas it worth it to participate? \\nFigure 1: Distribution (in %) of some of the answers given by male and female students.\\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\n70 \\n80 \\n90 \\nYES \\nNO \\nMale \\nFemale \\nAre you interested in physics outside school? \\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\n70 \\n80 \\n90 \\n100 \\nYES \\nNO \\nMale \\nFemale \\nWould you like to be a scientist? \\nFigure 2: Left: distribution (in %) of the answer to the question: are you interested in physics outside\\nschool? A significant difference between male and female students is seen. 
Right: answer to the question:\\nwould you like to be a scientist?\\nWe divided the sample in students who declared to be (not to be) interested in physics outside\\nschool, and their answer to the previous question is shown in fig. 3 left (right). Now the two\\ndistributions are very similar, for male and female students.\\nThe students are offered many options to choose from, to motivate their choice, and are asked\\nto select up to a maximum of five reasons for a “yes” or a “no” among the ones listed here.\\nYes because:\\n2\\nMasterclass and scientific careers\\nSandra Leone\\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\n70 \\n80 \\n90 \\n100 \\nYES \\nNO \\nMale \\nFemale \\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\n70 \\n80 \\nYES \\nNO \\nMale \\nFemale \\nFigure 3: Distribution (in %) of the answers to the question: would you like to be a scientist? on the left\\n(right) for students interested (not interested) in physics outside school.\\n• It’s s easy to find a job;\\n• I have a talent for science;\\n• I see myself as a scientist;\\n• I like science;\\n• I like to do things that are considered difficult;\\n• I like the idea of studying the mysteries of the universe and finding answers to new questions;\\n• I’m not scared by the idea of working in a lab, without regular meals and hours;\\n• One can make a lot of money in science;\\n• It’s a field where one can travel a lot;\\n• The choice of career has a high priority in my life;\\n• It would make my life more interesting;\\n• I’m not scared by the prospects of an all-encompassing job;\\n• I deeply admire scientists and consider them a role model;\\n• My teachers are encouraging and are advising me to undertake a scientific career;\\n• My family is encouraging me and would be very happy if I were to choose a scientific career.\\nNo, because:\\n• It’s difficult to find a job;\\n• I have no talent for science;\\n• I cannot see myself as a scientist;\\n• I don’t like science;\\n• Scientific disciplines are too difficult;\\n• One has to study too much;\\n• I would like to do more useful work;\\n• Working in a lab without regular meals and hours is not for me;\\n• I put my personal interests first;\\n• I don’t want to sacrifice my personal life for my career;\\n• I aspire to a normal life;\\n• I’m scared by the prospects of an all-encompassing job: I want to have time for myself;\\n• There aren’t scientists who I consider as a model;\\n3\\nMasterclass and scientific careers\\nSandra Leone\\n• My teachers are discouraging me;\\n• My family is discouraging me.\\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\n70 \\n80 \\n90 \\nMale \\nFemale \\nYES, because.... \\nFigure 4: Distribution (in %) of the motivations for willing to be a scientist.\\nFrom the distribution of the “yes” motivations, one can notice that more male (about 40%)\\nthan female (about 20%) students think that they have a talent for science. On the other hand, more\\nfemale (about 37%) than male (about 23%) students are attracted by the idea of traveling.\\nThe interpretation of the “no” distribution is affected by large statistical uncertainties, because\\nonly about 70 students answered “no”. However, it is interesting to notice that, among them, 65%\\nof female students feel that they have no talent for science (compared to 40% of male), and a few\\nof them are discouraged by family (while no male student is). In addition, 55% of male students\\nare afraid that in science they’ll not have enough time for themselves (compared to 7% of female\\nstudents).\\n4. 
Conclusion\\nWe present a preliminary analysis of the answers to about 500 questionnaires filled by students\\nattending the Masterclass of particle physics in Pisa from 2010 to 2016. Looking for differences\\nin answers from male and female students, we notice that almost 80% of male students declare to\\nbe interested in physics outside school, compared to 46% of female students. About 90% of male\\n4\\nMasterclass and scientific careers\\nSandra Leone\\n0 \\n10 \\n20 \\n30 \\n40 \\n50 \\n60 \\n70 \\nMale \\nFemale \\nNO: because ... \\nFigure 5: Distribution (in %) of the motivation for not willing to be a scientist.\\nstudents say that they would like to work in a STEM discipline, compared to about 77% of female\\nstudents.\\nWe plan to continue to distribute this questionnaire to students attending the Masterclass of\\nparticle physics in Pisa and collect more data. In addition, we asked the physics teachers to propose\\nthe general section of the questionnaire concerning scientific careers also to students who will not\\nattend the Masterclass. This will provide a control sample including students not as good as the\\nones coming to the Masterclass and not necessarily interested in science as a career. We aim to\\nbetter understand in which respect male students are more interested in physics outside school than\\nfemale students. At the end, we would like to obtain hints on how to intervene to correct the path\\nthat seems to naturally bring male students towards STEM disciplines and reject female students\\nfrom them.\\nReferences\\n[1] http://physicsmasterclasses.org/\\n[2] http://www.pi.infn.it/ leone/mc/mc2016/\\n[3] G. Chiarelli, S. Leone Le Masterclass come uno strumento per affrontare il gender gap?, presented at\\n“ Comunicare Fisica 2010”.\\n5\\n\\n\\n\\n---\\n\\n\\nDEVELOPMENTS FOR THE ISODAR@KAMLAND AND DAEδALUS\\nDECAY-AT-REST NEUTRINO EXPERIMENTS\\nJOSE R. ALONSO FOR THE ISODAR COLLABORATION\\nMassachusetts Institute of Technology, 77 Massachusetts Avenue,\\nCambridge, MA, 02139, USA\\nConfigurations of the IsoDAR and DAEδALUS decay-at-rest neutrino experiments are de-\\nscribed. Injector and cyclotron developments aimed at substantial increases in beam current\\nare discussed. The IsoDAR layout and target are described, and this experiment is compared\\nto other programs searching for sterile neutrinos.\\n1\\nIntroduction\\nFigure 1 – 8Li neutrino spectrum. Dashed = actual\\nspectrum, Solid = detector response for IBD events\\nDecay-At-Rest (DAR) experiments offer attractive\\nfeatures for neutrino physics studies.1 We discuss\\ntwo particular regimes where the characteristics\\nof the source are determined by the nature of\\nthe weak-interaction decay producing the neutrino,\\nand are not affected by kinematics or characteris-\\ntics of higher-energy production mechanisms. The\\nbeta decay case is manifested in the IsoDAR ex-\\nperiment; a sterile-neutrino search where a 60 MeV\\nproton beam is used to produce the parent isotope,\\n8Li. The product nucleus is stationary when it de-\\ncays, the neutrino spectrum is shown in Figure 1.\\nIt has a high endpoint energy, over 13 MeV, and a mean energy of 6.5 MeV, both substantially\\nhigher than backgrounds from other decays, and in an area easily accessible for detection by\\nInverse Beta Decay (IBD) in a hydrogen-containing neutrino detector.\\nFigure 2 – Neutrino spectrum from stopped\\nπ+. 
Note absence of ¯νe.\\nIn the regime where pions are produced at low en-\\nergy (with ≤800 MeV protons), pions can stop in the\\ntarget before decaying. This is the case for DAEδALUS,\\na sensitive CP violation measurement. As the nuclear\\ncapture probability for π−at rest in the target is ex-\\ntremely high, the neutrino spectrum from the stopped\\npions will be dominated by the decay of π+ by a fac-\\ntor of about 104. Figure 2 shows the neutrino spectra\\nfrom the π+ →µ+ →e+ decay. Noteworthy in this\\ndecay is the absence of electron antineutrinos, making\\nthis source a favored means of looking for appearance of\\n¯νe, again utilizing IBD in a suitable neutrino detector.\\nThese neutrino sources are isotropic, there is no\\narXiv:1611.03548v1 [physics.ins-det] 11 Nov 2016\\nkinematic directionality to define a beam. As a result, the efficiency of detection is directly\\nrelated to the solid angle subtended by the detector, placing high emphasis on having the source\\nas close to the detector as possible. In the case of IsoDAR this distance is a few meters from\\nthe detector surface (16.5 meters from the center of the KamLAND fiducial volume), in the case\\nof DAEδALUS the baseline is 20 km from the large water-Cherenkov counter (assumed to be\\nHyper-K). As the principal goals of these experiments is oscillation physics, the driving term is\\nL/E, the baseline distance divided by the neutrino energy. If E is low, the baseline L can also\\nbe low to preserve the same ratio. As a consequence, the 20 km baseline and 45 MeV average\\n¯νµ energy addresses the same oscillation point as the 1300 km, 3 GeV DUNE beam, or the 300\\nkm, 500 MeV T2K beam.\\nThe premise of these experiments is that relatively small and compact sources of neutrinos\\ncan be built and installed at the proper distances from existing or planned large water- or\\nliquid-scintillator-based neutrino detectors, providing access to the physics measurements with\\nsubstantially reduced costs.\\nWith respect to the long-baseline experiments (e.g.\\nT2K) the\\nbeamlines from the major accelerator centers operate much more efficiently and cleanly in the\\nneutrino mode, while the DAR measurements, utilizing IBD, address only the anti-neutrino\\nmode. Consequently, installing DAEδALUS cyclotrons at the proper distance from the long-\\nbaseline detectors, and operating the neutrino beams simultaneously, offers a huge improvement\\nin the sensitivity and data rates over the individual experiments. Discrimination of the source of\\nevents is straightforward, both from the energy deposition of events from each source, as well as\\nfrom timing: neutrinos from the cyclotrons are essentially continuous (up to 100% duty factor),\\nwhile those from the large accelerators are tightly pulsed with a very low overall duty factor.\\nNevertheless, the lack of directionality of DAR neutrinos, and the small solid angle between\\nsource and detector calls for the highest-possible flux from the source to ensure meaningful\\ndata rates. Available accelerator technologies and design configurations have been explored,\\nfor beam current performance, cost and footprint; we have arrived at the choice of compact\\ncyclotrons2. The only deficiency of this option is the average current. For appropriate data\\nrates, our specification is 10 mA of protons on target. 
This pushes the highest current from\\ncyclotrons by about a factor of 3,a and much of the accelerator development work of our group\\nto date has been devoted to addressing the factors that limit the maximum current in compact\\ncyclotrons3,4,5.\\nFigure 3 – Oscillations seen in KamLAND for a 5 year\\nIsoDAR run, for the global fit parameters still consistent\\nwith the IceCube analysis. IBD event rate is about 500\\nper day.\\nIn the next section the physics ratio-\\nnale for the IsoDAR and DAEδALUS exper-\\niments will be briefly described, while subse-\\nquent sections will address the configuration\\nof the cyclotrons, and progress made in push-\\ning the current limits from cyclotrons to the\\nrequired level. The IsoDAR target will be de-\\nscribed, capable of handling the 600 kW of\\nproton beams and optimized for 8Li produc-\\ntion. Finally, the IsoDAR experiment will be\\ncompared with other ongoing initiatives for\\nsearching for sterile neutrinos.\\n2\\nNeutrino Measurements\\n2.1\\nIsoDAR\\naIsotope-producing H−cyclotrons rarely reach 2 mA, the current record-holder for cyclotron current is the\\n3 mA PSI Injector 2, a 72 MeV separated-sector proton cyclotron injecting the 590 MeV Ring Cyclotron.\\nFigure 4 – Sensitivity of 5 year IsoDAR run compared to other ster-\\nile neutrino experiments. DANSS is a reactor experiment in Kalinin\\n(Russia)9;\\n144Ce and 51Cr are the SOX experiment at Borexino\\n(Gran Sasso, Italy)10, PROSPECT is a reactor experiment at HFIR\\nat ORNL (USA)11.\\nAnomalies in ¯νe disappearance rates\\nhave been observed in reactor and\\nradioactive source experiments6. Pos-\\ntulated to explain these has been the\\nexistence of one or more sterile neu-\\ntrinos, that do not in themselves in-\\nteract in the same manner as “ac-\\ntive” neutrinos (hence are called\\n“sterile”), however the active neutri-\\nnos can oscillate through these ster-\\nile states, and in this manner affect\\nthe ratio of appearance and disap-\\npearance from the known three fla-\\nvor eigenstates. Global fits7 of data\\nfrom experiments point to a mass\\nsplitting in the order of 1 to almost\\n8 eV 2, and a sin2(2 θ) of 0.1. Re-\\ncent analysis of IceCube data8, ex-\\nploiting a predicted resonance in the\\nMSW matrix for ¯νµ passing through\\nthe core of the earth appear to rule\\nout ∆m2 values of 1 eV 2 or below, however values above this energy are still possible.\\nThe very large ∆m2 imply a very short wavelength for the oscillations, in fact for the 8Li\\nneutrino it is measured in meters, so within the fiducial volume of KamLAND one could see\\nseveral full oscillations. Folding in the spatial and energy resolutions of the KamLAND detector\\n(12 cm/√EMeV ) and (6.4%/√EMeV ) respectively, the expected neutrino interaction pattern for\\nthe case of ∆m2 = 1.75 eV 2 is shown in Figure 3.\\nFigure 4 shows a sensitivity plot for IsoDAR, this experiment covers very well the regions of\\ninterest for sterile neutrinos.\\n2.2\\nLayout of DAEδALUS Experiment\\nSearch for CP violation in the lepton sector has been a high priority for many years. DAEδALUS\\ncombined with a long-baseline beam (e.g. T2K @ Hyper-K operating in neutrino mode only)\\ncan in 10 years cover almost all of the δ CP-violating phase angles.12\\nFigure 5 – Schematic of the two cyclotrons\\nin a DAEδALUS module.\\nThe injector\\n(DIC - DAEδALUS Injector Cyclotron) also\\nserves as the proton source for IsoDAR. 
The\\nDSRC (DAEδALUS Superconducting Ring\\nCyclotron) produces protons at 800 MeV.\\nThe experimental configuration includes three sta-\\ntions, each with identical targets that provide neutrino\\nsources (from stopped π+), one at 1.5 km (essentially\\nas close to the detector as feasible) that normalizes the\\nflux seen in the detector, one at 8 km that catches the\\nrise in the ¯νe appearance, and the principal station at\\n20 km, which measures the ¯νe appearance at the peak\\nof the oscillation curve. The absolute appearance am-\\nplitude is modulated by the CP-violating phase. The\\ncurrent on target, hence the neutrino flux, is adjusted\\nsequentially at each station (by “beam-on” timing) to\\nbe approximately equivalent to the flux from the long-\\nbaseline beam. The total timing cycle from all stations\\nallows approximately 40% of time when none are deliv-\\nering neutrinos, for background measurements.\\n3\\nCyclotron Configuration\\nFigure 5 shows schematically the basic configuration of a cyclotron “module” for DAEδALUS,\\nshowing the “chain” of injector-booster cyclotron with a top energy of 60 MeV, and the main\\nDAEδALUS superconducting ring cyclotron (DSRC) which delivers 800 MeV protons to the\\npion-production target. Note that the injector cyclotron is exactly the machine that is needed\\nfor the IsoDAR experiment, so developing this cyclotron is a direct step in the path towards\\nDAEδALUS.\\nTable 1: The most relevant parameters for the IsoDAR and DAEδALUS cyclotrons. IsoDAR has a single\\nstation with one cyclotron, DAEδALUS has three stations, at 1.5, 8, and 20 km from the detector. The\\nfirst two stations have a single cyclotron pair (DIC and DSRC), the 20 km station has two cyclotron pairs\\nfor higher power. Though the total power is high, because the targets are large and the beam is uniformly\\nspread over the target face, the power density is low enough to be handled by conventional engineering\\ndesigns. The DAEδALUS target has a long conical reentrant hole providing a very large surface area.\\nIsoDAR\\nDAEδALUS\\nParticle accelerated\\nH+\\n2\\nH+\\n2\\nMaximum energy\\n60 MeV/amu\\n800 MeV/amu\\nExtraction\\nSeptum\\nStripping\\nPeak beam current (H+\\n2 )\\n5 mA\\n5 mA\\nPeak beam current (proton)\\n10 mA\\n10 mA\\nNumber of stations\\n1\\n3\\nDuty factor\\n100%\\n15% - 50%\\n(time switching between 3 stations)\\nPeak beam power on target\\n600 kW\\n8 MW\\nPeak power density on target\\n2 kW/cm2\\n≈2 kW/cm2\\nAverage beam power on target\\n600 kW\\n1.2 to 4 MW\\nMaximum steel diameter\\n6.2 meters\\n14.5 meters\\nApproximate weight\\n450 tons\\n5000 tons\\nTable 1 lists high-level parameters for the IsoDAR and DAEδALUS cyclotrons. Note the\\npower implication of delivering 10 mA to the production targets.\\nThese very high power-\\nrequirements call for minimizing beam loss during the acceleration and transport process. Any\\nbeam loss is not only destructive of components, but also activates materials and greatly com-\\nplicates maintenance of accelerator systems. Some beam loss is unavoidable, however by appro-\\npriate use of cooled collimators and beam dumps, and by restricting as much as possible these\\nlosses to the lower energy regions of the cyclotrons, the thermal and activation damage can be\\nminimized.\\nThe single biggest innovation in these cyclotrons, aimed at increasing the maximum current,\\nis the use of H+\\n2 ions13 instead of protons or H−. 
As the biggest source of beam loss is space\\ncharge blowup at low energies, the lower q/A (2 protons for a single charge), and higher mass per\\nion (= 2 amu - atomic mass units) greatly reduces the effects of the repulsive forces of the very\\nhigh charge in a single bunch of accelerated beam. This helps keep the size of the accelerated\\nbunches down so there will be less beam lost on the inside of the cyclotron.\\nKeeping the\\nmolecular ion to the full energy also allows for stripping extraction at 800 MeV/amu, reducing\\nbeam loss in the extraction channels.\\nWhile the size and weight of these cyclotrons may appear large, there are examples of ma-\\nchines of comparable size that can serve as engineering models for beam dynamics, magnetic\\nfield design and costing. The PSI Injector 2, a 72-MeV 3-mA machine models some aspects of\\nthe IsoDAR cyclotron relating to the RF system and space-charge dominated beam dynamics14.\\nMagnet design and steel size/weight bear some similarities to IBA’s 235 MeV proton radiother-\\napy cyclotron15. The DSRC bears significant similarities to the superconducting ring cyclotron\\nat RIKEN16. While this cyclotron is designed for uranium beams, so the beam dynamics are\\nnot directly relevant, the cryostat and magnet designs are extremely close to the DAEδALUS\\nrequirements, and so serve as a good engineering and costing model for the DSRC.\\n4\\nIsoDAR developments\\nAs indicated above, efforts of our group have focused on producing high currents of H+\\n2 for\\ninjection into the IsoDAR cyclotron, modeling the capture and acceleration of these ions, and\\non the design of the target for handling 600 kW of proton beam and maximizing the production\\nof 8Li to generate the ¯νe flux delivered to KamLAND.\\n4.1\\nProducing High Currents of H+\\n2 for Injection\\nExperiments at the Best Cyclotron Systems, Inc. test stand in Vancouver, BC 3 tested the VIS\\nhigh-current proton source17 for its performance in generating H+\\n2 beams. Our requirement\\nfor H+\\n2 is a maximum of 50 mA of continuous beam from the source, which would provide an\\nadequate cushion in the event that capture into the cyclotron cannot be enhanced by efficient\\ntime-bunching of the beam (see next section). The VIS only produced about 15 mA of H+\\n2\\n(while we did measure 40 mA of protons); using this source would require efficient bunching. To\\nincrease our safety margin, a new ion source, labeled “MIST-1” has been built18 based on an\\nLBL-developed filament-driven, multicusp design19 which demonstrated a much more favorable\\np/H+\\n2 ratio, and currents in the range required. This source has been designed with a high\\ndegree of flexibility, to adjust geometric, magnetic field and plasma conditions to optimize H+\\n2\\nperformance. It is now being commissioned.\\n4.2\\nCapturing and Accelerating High Currents of H+\\n2\\nFigure 6 – Low energy injection line and central region of the DIC.\\nA short transport line connects the MIST-1 H+\\n2 ion source with the\\nRFQ buncher, which compresses the beam into packets of about\\n± 15◦. These packets are fed to the spiral inflector (photographed\\nin lower-right), electrostatic deflector plates that bend the beam into\\nthe plane of the cyclotron. The distance from the end of the RFQ\\nto the accelerating dees must be kept to a minium as there is energy\\nspread in the beam and long transport distances will cause the beam\\nto debunch. 
As a result the RFQ must be installed largely inside\\nthe steel of the cyclotron (pictured in upper right).\\nCyclotrons accelerate beam via RF\\n(radio-frequency, for our cyclotron\\naround 50 MHz) fields applied to\\nelectrodes (called “Dees”) extending\\nalong the full radial extent of the\\nbeam. Particles reaching the accel-\\nerating gap at the right phase of the\\nRF will receive a positive kick, while\\nthose arriving outside this phase an-\\ngle will be decelerated and lost. The\\nphase acceptance of the cyclotron\\nis typically about ± 15◦, so if the\\ninjected beam is not bunched lon-\\ngitudinally, only 10% of a continu-\\nous beam will be accepted.\\nHence\\nthe need for 50 mA of unbunched\\nbeam.\\nBunching is conventionally\\ndone with a double-gap RF cavity\\nplaced about one meter ahead of the\\ninjection point. Maximum efficiency\\nimprovement is no more than a fac-\\ntor of 2 or 3.\\nA novel bunching technique us-\\ning an RFQ was proposed many\\nyears ago20 that could in principle improve bunching efficiency to almost 85%. We have re-\\ncently been awarded funding from NSF to develop this technique, and are working with the\\noriginal proponent, and other key RFQ groups in the US and Europe to build and test this new\\nbuncher. Figure 6 shows schematically the central region of the cyclotron, including the MIST-1\\nsource, the RFQ, and spiral inflector that bunches and bends the beam into the plane of the\\ncyclotron.\\nOnce inflected into the plane of the cyclotron, the beam must be stably captured and ac-\\ncelerated to the full energy and extraction radius (of 2 meters in our case). In addition, there\\nmust be adequate turn separation at the outer radius to cleanly extract the beam. The parti-\\ncles experience 96 turns from injection to extraction, and the radial size of the beam must be\\ncontrolled so that a thin septum can be inserted between the 95th and 96th turns that will not\\nintercept any appreciable amount of beam. With a total of 600 kW, even a fraction of a percent\\nof beam lost on this septum can damage it.\\nFigure 7 – Configuration of IsoDAR on the\\nKamLAND site.\\nExtensive simulations, using the OPAL code21 de-\\nveloped at PSI specifically for beam-dynamics of highly\\nspace-charge-dominated beams in cyclotrons have been\\nused to show that this is possible, and to locate col-\\nlimators and scrapers in the first few turns to control\\nbeam halo (that would be intercepted on the extraction\\nseptum). This code has also shown that space-charge\\nforces can actually contribute to stability of the acceler-\\nating bunch by introducing a vortex motion within the\\nbunch that limits longitudinal and transverse growth of\\nthe bunch22.\\nThese developments give us confidence that the technical specifications for the IsoDAR\\ncyclotron can be met.\\n4.3\\nTarget design\\nThe configuration of the IsoDAR experiment is shown in Fig 7. The cyclotron is located in a\\nvault previously used for water purification, the target is located in one of the construction drifts\\nrepurposed as a control room that is no longer used.\\nFigure 8 – Target/sleeve/shielding structure. The target is 16.5 me-\\nters from the center of the KamLAND fiducial volume. Beam is bent\\n30◦to the target providing shielding for backstreaming neutrons. A\\nwobbler magnet spreads beam out on the 20 cm diameter target face.\\nThe target assembly can be pulled from the back of the structure into\\na casket. 
This hole is also shielded with removable concrete blocks.\\nThe shielding structure consists of steel and borated concrete.\\nBeam is extracted from the cy-\\nclotron and transported about 50\\nmeters to the target located close to\\nthe KamLAND detector. The 5 mA\\nof H+\\n2 is stripped in this transport\\nline, the resulting 10 mA of protons\\nare directed to the beryllium target.\\nBeryllium is a very efficient neutron\\nproducer, for the 60 MeV proton\\nbeam the yield is approximately 1\\nneutron per 10 protons. These neu-\\ntrons stream through to the sleeve\\nsurrounding the target, containing\\nsmall beryllium spheres (less than 1\\ncm diameter) surrounded by highly-\\nenriched 7Li (99.995%) . The sleeve\\nis a cylinder 50 cm in radius and 2\\nmeters long, and is surrounded by a\\n5 cm graphite reflector. Shielding outside the reflector consisting of iron and borated concrete\\nwhich contains the neutron flux to limit neutrons reaching the rock walls.\\nFig 8 shows the target, sleeve and shielding assembly in relation to the KamLAND detector.\\nThe 8Li yield from the moderated and captured neutrons varies with the fractional composition\\nof beryllium and lithium in the sleeve, the maximum is about 3% (8Li per incident proton on\\ntarget) for 30% (by weight) of lithium. This is close to the interstitial volume of tightly packed\\nspheres. All numbers are based on GEANT4 calculations23.\\nFigure 9 – Section through target and sleeve.\\nFig 9 shows the target assembly, a spun-cast beryl-\\nlium piece with the front surface (where the beam hits)\\nbeing 1.8 cm thick (range of protons is 2 cm, so Bragg\\npeak, at energy too low to efficiently produce neutrons,\\nis in the cooling water, reducing heat load in target.\\nA jet of heavy water is directed to the back surface of\\nthe target in a manner that effectively removes the 600\\nkW of beam power to a heat exchanger. The thermal\\nbehavior of the target is being modeled and will be ex-\\nperimentally tested in the future.\\n5\\nIsoDAR Compared with other Sterile Neu-\\ntrino Experiments\\nTable 2 compares the IsoDAR experiment with two\\nother sterile-neutrino search experiments, SOX10 and\\nDANSS9.\\nSensitivity comparisons were given in Figure 4, the table highlights some of the\\nrationale for the significantly higher sensitivity of IsoDAR.\\nTable 2: Comparison of IsoDAR with SOX, the 144Ce experiment at Borexino, and DANSS, a represen-\\ntative reactor experiment. Relative sensitivities of these three experiments were shown in Fig. 4\\n.\\nIsoDAR\\nSOX\\nDANSS\\nSOURCE\\n8Li\\n144Ce\\nFuel burning\\nSpectral purity\\nClean β spectrum\\nClean β spectrum\\ncomplex, with anomalies\\nRate stability\\nStable, dependent\\non accelerator\\nDecays with\\n285 day halflife\\nChanges with\\nfuel aging\\nEnergy of ¯νe\\nflux maximum\\n8.5 MeV\\n3.4 MeV\\n3.5 MeV\\nDETECTOR\\nKamLAND\\nBorexino\\nSolid scintillator\\nVolume\\n900 tons\\n100 tons\\n<10 tons\\nNeutron bkgnd\\nManageable\\nshield design\\nManageable\\nshield design\\nDifficult to shield, limits\\nproximity to core\\nCosmic bkgnd\\n(rock overburden)\\n2700 MWE\\n3400 MWE\\nshallow,\\nhigh muon rates\\nIn summary, IsoDAR is a very compelling experiment for the search for sterile neutrinos, but\\nbecause of the high event rates and excellent statistics, the reach of physics for this extremely\\nshort baseline configuration extends to non-standard interactions, spectral shape and other\\nneutrino-characterization experiments as well. 
The challenging technologies for producing the\\nhigh-power beams and optimizing neutrino production are being developed at a steady pace,\\never increasing the feasibility of these experiments.\\nAcknowledgments\\nWork supported by the US National Science Foundation under Grant No. NSF-PHY-1505858,\\nand by the MIT Bose Foundation.\\nReferences\\n1. A. Bungau, et al., Phys. Rev. Lett. 109, 141802 (2012)\\n2. A. Adelmann, et al., arXiv:1210.4454 [physics.acc-ph]\\n3. J.R. Alonso, et al., arXiv:1508.03850 [physics.acc-ph]\\n4. D. Winklehner, et al., arXiv:1507.07258 [physics.acc-ph]\\n5. J.J. Yang, et al., Nucl. Instrum. Methods A 704, 84 (2013)\\n6. G. Mention, et al., Phys. Rev. D 83, 073006 (2011)\\n7. C. Giunti, M. Laveder, Phys. Lett. B 706, 200 (2011), arXiv:1111.1069 [hep-ph]\\n8. G.H. Collin, C.A. Argüelles, J.M. Conrad, M.H. Shaevitz, Phys. Rev. Lett. (in press); arXiv:1607.00011 [hep-ph]\\n9. M. Danilov, arXiv:1412.0817 [physics.ins-det]\\n10. O. Smirnov, et al., Physics Procedia 61, 511 (2015)\\n11. J. Ashenfelter, et al., arXiv:1309.7647 [physics.ins-det]\\n12. C. Aberle, et al., arXiv:1307.2949 [physics.acc-ph]\\n13. L. Calabretta, et al., accelconf.web.cern.ch/AccelConf/p99/PAPERS/THP139.PDF\\n14. A.M. Kolano, et al., accelconf.web.cern.ch/AccelConf/IPAC2014/papers/tupri031.pdf\\n15. E. Syresin, et al., accelconf.web.cern.ch/AccelConf/IPAC2011/papers/weps085.pdf\\n16. K. Yamada, et al., accelconf.web.cern.ch/AccelConf/e08/papers/thpp069.pdf\\n17. L. Celona, et al., Rev. Sci. Instrum. 75, 1423 (2004)\\n18. S. Axani, et al., Rev. Sci. Instrum. 87, 02B704 (2016)\\n19. K.W. Ehlers, K.-N. Leung, Rev. Sci. Instrum. 54, 677 (1983)\\n20. R.W. Hamm, et al., accelconf.web.cern.ch/AccelConf/c81/papers/ec-03.pdf\\n21. A. Adelmann, et al., accelconf.web.cern.ch/AccelConf/ICAP2009/papers/we3iopk01.pdf\\n22. J. Jonnerby, D. Winklehner (Private communications)\\n23. A. 
Bungau, etal, arXiv:1205,5790 [physics-acc-ph]\\n\\n\"}\n", "==================================\u001B[1m Ai Message \u001B[0m==================================\n", "\n", "FINAL ANSWER: none\n" ] } ], "execution_count": 40 }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T21:35:29.620614Z", "start_time": "2025-05-24T21:35:13.131358Z" } }, "cell_type": "code", "source": [ "question = \"In April of 1977, who was the Prime Minister of the first place mentioned by name in the Book of Esther (in the New International Version)?\"\n", "messages = [HumanMessage(content=question)]\n", "messages = graph.invoke({\"messages\": messages})" ], "id": "837ae65dab259cd6", "outputs": [], "execution_count": 41 }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T21:35:29.679887Z", "start_time": "2025-05-24T21:35:29.677840Z" } }, "cell_type": "code", "source": [ "for m in messages['messages']:\n", " m.pretty_print()" ], "id": "45de7688dadbe85", "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================\u001B[1m Human Message \u001B[0m=================================\n", "\n", "In April of 1977, who was the Prime Minister of the first place mentioned by name in the Book of Esther (in the New International Version)?\n", "==================================\u001B[1m Ai Message \u001B[0m==================================\n", "Tool Calls:\n", " web_search (6bf1f2de-2483-4772-8b41-f08cb78832ae)\n", " Call ID: 6bf1f2de-2483-4772-8b41-f08cb78832ae\n", " Args:\n", " query: first place mentioned by name in the Book of Esther New International Version\n", "=================================\u001B[1m Tool Message \u001B[0m=================================\n", "Name: web_search\n", "\n", "{\"web_results\": \"\\nsouthern reaches of Egypt and Sudan. The mention of these regions highlights the diversity and the strategic importance of the empire's territories. It also sets the stage for understanding the multicultural context in which Esther's story takes place. The vastness of the empire is a testament to the fulfillment of God's promise to Abraham that his descendants would be a blessing to all nations, as the Jewish people were dispersed throughout these regions.Persons / Places / Events1.Xerxes [...] place. The vastness of the empire is a testament to the fulfillment of God's promise to Abraham that his descendants would be a blessing to all nations, as the Jewish people were dispersed throughout these regions.Persons / Places / Events1. Xerxes (Ahasuerus)The king of Persia during the time of Esther. Known for his vast empire and significant influence, he is a central figure in the Book of Esther.2. 127 ProvincesThe extensive realm over which Xerxes ruled, stretching from India to Cush [...] strategic importance of the empire's territories. It also sets the stage for understanding the multicultural context in which Esther's story takes place. The vastness of the empire is a testament to the fulfillment of God's promise to Abraham that his descendants would be a blessing to all nations, as the Jewish people were dispersed throughout these regions.Persons / Places / Events1. Xerxes (Ahasuerus)The king of Persia during the time of Esther. Known for his vast empire and significant\\n\\n\\n---\\n\\n\\nSetting\\n\\nThe biblical Book of Esther is set in the Persian capital of Susa (Shushan) in the third year of the reign of the Persian king Ahasuerus. 
The name Ahasuerus is equivalent to Xerxes[4] (both deriving from the Persian Khshayārsha),[5] and Ahasuerus is usually identified in modern sources as Xerxes I,[6][7] who ruled between 486 and 465 BCE,[4] as it is to this monarch that the events described in Esther are thought to fit the most closely.[5][8] [...] The Book of Esther (Hebrew: מְגִלַּת אֶסְתֵּר, romanized: Megillat Ester; Greek: Ἐσθήρ; Latin: Liber Esther), also known in Hebrew as \\\"the Scroll\\\" (\\\"the Megillah\\\"), is a book in the third section (Ketuvim, כְּתוּבִים \\\"Writings\\\") of the Hebrew Bible. It is one of the Five Scrolls (Megillot) in the Hebrew Bible and later became part of the Christian Old Testament. The book relates the story of a Jewish woman in Persia, born as Hadassah but known as Esther, who becomes queen of Persia and thwarts [...] Assuming that Ahasuerus is indeed Xerxes I, the events described in Esther began around the years 483–482 BCE, and concluded in March 473 BCE.\\n\\nClassical sources such as Josephus, the Jewish commentary Esther Rabbah and the Christian theologian Bar Hebraeus,[9] as well as the Greek Septuagint translation of Esther, instead identify Ahasuerus as either Artaxerxes I (reigned 465 to 424 BCE) or Artaxerxes II (reigned 404 to 358 BCE).[9]\\n\\n\\n---\\n\\n\\nof the concubines.(AH) She would not return to the king unless he was pleased with her and summoned her by name.(AI) [...] The Greatness of Mordecai [...] By submitting your email address, you understand that you will receive email communications from Bible Gateway, a division of The Zondervan Corporation, 501 Nelson Pl, Nashville, TN 37214 USA, including commercial communications and messages from partners of Bible Gateway. You may unsubscribe from Bible Gateway’s emails at any time. If you have any questions, please review our Privacy Policy or email us at privacy@biblegateway.com.\\n\\nPreferences\\n\"}\n", "==================================\u001B[1m Ai Message \u001B[0m==================================\n", "Tool Calls:\n", " web_search (95e0465a-38e6-4e1a-9ddc-bfca9b70f2a3)\n", " Call ID: 95e0465a-38e6-4e1a-9ddc-bfca9b70f2a3\n", " Args:\n", " query: prime minister of India April 1977\n", "=================================\u001B[1m Tool Message \u001B[0m=================================\n", "Name: web_search\n", "\n", "{\"web_results\": \"\\n|\\n| Signature | |\\n| \\n|\\nMorarji Ranchhodji Desai (29 February 1896 – 10 April 1995) was an Indian politician and independence activist who served as the prime minister of India between 1977 and 1979 leading the government formed by the Janata Party. During his long career in politics, he held many important posts in government such as the chief minister of Bombay State, the home minister, the finance minister, and the deputy prime minister. [...] Appearance\\nmove to sidebar hide\\nFrom Wikipedia, the free encyclopedia\\nPrime Minister of India from 1977 to 1979\\n| \\nMorarji Desai\\n|\\n| --- |\\n| \\nDesai, c. 1978\\n|\\n| |\\n| Prime Minister of India |\\n| In office\\n24 March 1977 (1977-03-24) – 28 July 1979 (1979-07-28) |\\n| President | Neelam Sanjiva Reddy[Note 1] |\\n| Vice President | B. D. Jatti |\\n| Deputy | \\n\\nCharan Singh (24 January 1979 to 16 July 1979)\\nJagjivan Ram [...] umbrella of the Janata Party, and won the 1977 election. Desai was elected prime minister, and became the first non-Congress prime minister of India. 
Desai was the second and the last prime minister to have been born in the nineteenth century.\\n\\n\\n---\\n\\n\\nThe event of Morarji Desai unseating Indira Gandhi as Prime Minister of India in 1977 marked a significant turning point in Indian politics. Indira Gandhi, who had been a powerful leader since taking office in 1966, faced severe backlash following her controversial imposition of a state of emergency in 1975, which suspended civil rights and led to widespread arrests of her political opponents. After two years under emergency rule, she unexpectedly called for general elections, allowing the [...] Morarji Desai (1896-1995), prime minister of India, 1977-1979\\nIndira Gandhi (1917-1984), prime minister of India, 1966-1977 and 1980-1984\\nSanjay Gandhi (1946-1980), younger son of Indira Gandhi\\nJayaprakash Narayan (1902-1979), Indian freedom fighter and political leader\\nJawaharlal Nehru (1889-1964), prime minister of India, 1947-1964\\nMotilal Nehru (1861-1931), founder of India’s “Nehru dynasty”\\nRajiv Gandhi (1944-1991), elder son of Indira Gandhi and prime minister of India, 1984-1989 [...] politics. The 1977 election thus represented only a temporary halt to the rule of the Congress Party and the Nehru dynasty in India.\\n\\n\\n---\\n\\n\\nAn exclusive interview with Morarji Desai Morarj the 4th Prime Minister of India conducted shortly after his historic victory while he was\\n\"}\n", "==================================\u001B[1m Ai Message \u001B[0m==================================\n", "Tool Calls:\n", " web_search (2bb5841f-12ef-4523-a612-ad22203385c2)\n", " Call ID: 2bb5841f-12ef-4523-a612-ad22203385c2\n", " Args:\n", " query: first place mentioned by name in the Book of Esther NIV\n", "=================================\u001B[1m Tool Message \u001B[0m=================================\n", "Name: web_search\n", "\n", "{\"web_results\": \"\\n\\n---\\n\\n\\nsouthern reaches of Egypt and Sudan. The mention of these regions highlights the diversity and the strategic importance of the empire's territories. It also sets the stage for understanding the multicultural context in which Esther's story takes place. The vastness of the empire is a testament to the fulfillment of God's promise to Abraham that his descendants would be a blessing to all nations, as the Jewish people were dispersed throughout these regions.Persons / Places / Events1.Xerxes [...] place. The vastness of the empire is a testament to the fulfillment of God's promise to Abraham that his descendants would be a blessing to all nations, as the Jewish people were dispersed throughout these regions.Persons / Places / Events1. Xerxes (Ahasuerus)The king of Persia during the time of Esther. Known for his vast empire and significant influence, he is a central figure in the Book of Esther.2. 127 ProvincesThe extensive realm over which Xerxes ruled, stretching from India to Cush [...] XerxesThe book of Esther begins by setting the historical context during the reign of Xerxes I, also known as Ahasuerus in Hebrew. Xerxes was a Persian king who ruled from 486 to 465 BC. His reign is well-documented in both biblical and secular history, providing a backdrop for the events in Esther. Xerxes is known for his massive empire and his failed invasion of Greece, which is not mentioned in the Bible but is a significant historical event. 
The mention of \\\"the days of Xerxes\\\" situates the\\n\\n\\n---\\n\\n\\n2:14 In the evening she went, and on the morrow she returned into the second house of the women, to the custody of Shaashgaz, the king's chamberlain, which kept the concubines: she came in unto the king no more, except the king delighted in her, and that she were called by name. [...] 3:12 Then were the king's scribes called on the thirteenth day of the first month, and there was written according to all that Haman had commanded unto the king's lieutenants, and to the governors that were over every province, and to the rulers of every people of every province according to the writing thereof, and to every people after their language; in the name of king Ahasuerus was it written, and sealed with the king's ring.\\n\"}\n", "==================================\u001B[1m Ai Message \u001B[0m==================================\n", "\n", "FINAL ANSWER: Morarji Desai\n" ] } ], "execution_count": 42 }, { "metadata": { "ExecuteTime": { "end_time": "2025-05-24T21:35:29.861187Z", "start_time": "2025-05-24T21:35:29.858636Z" } }, "cell_type": "code", "source": [ "# check last message\n", "messages['messages'][-1].content" ], "id": "a41a87f74631e01b", "outputs": [ { "data": { "text/plain": [ "'FINAL ANSWER: Morarji Desai'" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 43 },
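{ "metadata": {}, "cell_type": "markdown", "source": "### Batch evaluation (sketch)\n\nA minimal, untested sketch of how the agent could be scored against the reference answers in metadata.jsonl. It assumes `graph`, `HumanMessage`, and `json_QA` are still defined from the earlier cells, that each entry carries `task_id`, `Question`, and `Final answer` fields, and that the agent ends its reply with `FINAL ANSWER: ...`; the exact-match scoring is deliberately naive, and every sampled question triggers live tool calls.", "id": "f3b9d0c5a1e24b77" }, { "metadata": {}, "cell_type": "code", "source": [ "# Hedged sketch: run the agent on a few sampled questions and compare to the reference answers.\n", "# Assumes `graph`, `HumanMessage`, and `json_QA` from the cells above.\n", "import random\n", "\n", "def extract_final_answer(text):\n", "    # The agent is prompted to finish with 'FINAL ANSWER: <answer>'.\n", "    marker = 'FINAL ANSWER:'\n", "    return text.split(marker, 1)[1].strip() if marker in text else text.strip()\n", "\n", "subset = random.sample(json_QA, 3)  # keep it small: each question makes live tool calls\n", "correct = 0\n", "for sample in subset:\n", "    result = graph.invoke({'messages': [HumanMessage(content=sample['Question'])]})\n", "    predicted = extract_final_answer(result['messages'][-1].content)\n", "    expected = sample['Final answer']\n", "    match = predicted.strip().lower() == expected.strip().lower()  # naive exact match\n", "    correct += match\n", "    print(f\"{sample['task_id']}: predicted={predicted!r} expected={expected!r} match={match}\")\n", "\n", "print(f'Matched {correct} of {len(subset)} sampled questions')" ], "id": "c0ffee4badc0de42", "outputs": [], "execution_count": null } ], "metadata": { "kernelspec": { "display_name": "aiagent", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 5 }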