AI, Neurosciences and Human Biases

Introduction

The quest to replicate human intelligence has been a driving force behind the fields of neuroscience and artificial intelligence (AI). However, every attempt reflects not only technical prowess but also inherent human biases—cultural, cognitive, and epistemological. This article explores how neuroscience shaped AI, how human limitations impact machine learning, and why embracing diversity in cognition is key to the next stage of AI evolution.

1. Neurosciences as the matrix of AI

Neuroscience has provided the foundational metaphors and models for AI development. Early work by Changeux and Dehaene offered insights into how neurons communicate, inspiring the basic architectures of artificial neural networks. Friston’s free energy principle and Tononi’s Integrated Information Theory (IIT) propose complex ways of modeling brain activity mathematically, laying groundwork for AI systems that seek to emulate human prediction, consciousness, and self-organization. However, these models remain abstractions, not direct replicas of biological complexity.

2. From modeling to algorithmic mimicry

Artificial neural networks abstract biological neurons into mathematical nodes. This abstraction process introduces selection bias: humans choose which aspects of cognition to model based on what is easiest to quantify or reproduce. Logic, classification, and pattern recognition dominate, while phenomena like emotion, creativity, and bodily awareness are sidelined. Thus, AI systems mirror a narrow and culturally shaped view of intelligence.

3. AI and cognition: Towards simulating the brain

Deep learning, through multiple hidden layers, attempts to mimic hierarchical structures of human perception—such as the visual cortex processing shapes and patterns. Projects like the Human Brain Project and DeepMind’s AlphaFold illustrate the ambition to model cognitive processes at scale. Yet, such efforts often prompt philosophical inquiries: What counts as "thinking"? Can we separate intelligence from emotion, embodiment, or culture?

4. Anthropocentric bias and cognitive reduction

By modeling intelligence based on narrow human standards, AI systems risk reducing intelligence to a checklist: speed, logic, classification, prediction. Qualities like hesitation, intuition, and emotional intelligence—which are integral to real-world human decision-making—are often ignored because they are harder to model. This reduction not only narrows AI capabilities but also distorts our understanding of what true intelligence entails.

5. Cognitive ethnocentrism in AI design

AI datasets predominantly represent Western, urban, capitalist cultures. This results in systems optimized for those contexts, perpetuating biases against other cognitive traditions, such as communal decision-making or intuitive knowledge prevalent in non-Western societies. Ignoring cognitive diversity risks creating AI that is blind to a vast array of human and non-human intelligences.

6. AI as a mirror of our anthropocentric limits

Cases of algorithmic bias—in facial recognition, predictive policing, or hiring algorithms—highlight that AI systems often mirror and even amplify societal inequalities. The myth of "algorithmic objectivity" masks underlying issues: biased data, monocultural development teams, and a lack of epistemic humility. Recognizing these limitations is crucial to developing fairer and more inclusive AI.

7. Towards post-anthropocentric and neurodiverse AI

To advance AI, we must imagine intelligences beyond the human mold. Integrating animal cognition, plant networks, and even collective systems like bee colonies offers models of decentralized, adaptive intelligence. Including neurodiverse human data—such as autistic patterns of perception—could expand AI’s cognitive range. Reframing AI not as human replica but as creative otherness unlocks richer technological futures.

8. AI and Hype: Development, cycles and critiques

Definition and context

AI hype exaggerates capabilities and timelines, driven by media sensationalism, corporate marketing, and geopolitical competition. Announcements of "general AI" or "superintelligence" often obscure the genuine but incremental nature of current progress.

Origins and cycles

Historically, AI has experienced "boom and bust" cycles: surges of optimism (like expert systems in the 1980s) followed by "AI winters" of funding cuts and public disillusionment. Gartner’s Hype Cycle graphically depicts this rhythm.

Current reasons for hype

  • Fundraising leverage in startups and research labs.
  • Talent acquisition through brand prestige.
  • Geopolitical urgency in technology races.
  • Media's fascination with dystopian and utopian narratives.

Limits and risks

  • Overpromised capabilities can lead to public mistrust.
  • Ethical challenges: bias, disinformation, surveillance.
  • Potential for reduced investments after disillusionment phases.

Critical perspectives

Researchers like Crawford and Mitchell call for transparency, regulatory frameworks, and interdisciplinary approaches to ground AI progress ethically and socially.

9. Mathematical foundations: LLM, RAG, FT, NLP

LLM (Large Language Model)

Large Language Models like GPT rest on statistical and linear-algebraic foundations: probability distributions for modeling uncertainty over the next token, linear algebra for representing word embeddings and attention, and gradient-based optimization techniques for training very large networks efficiently.
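
To make the statistical core concrete, here is a minimal sketch (plain NumPy, with made-up logits and a toy four-word vocabulary) of how a model's output scores can be turned into a probability distribution over the next token via softmax and then sampled:

```python
import numpy as np

# Minimal sketch: an LLM's final layer produces a score (logit) per vocabulary
# token; softmax turns these scores into a probability distribution to sample from.
vocab = ["the", "cat", "sat", "mat"]            # toy vocabulary (illustrative)
logits = np.array([2.0, 0.5, 1.0, -1.0])        # made-up next-token scores

probs = np.exp(logits - logits.max())           # subtract max for numerical stability
probs /= probs.sum()

rng = np.random.default_rng(0)
next_token = rng.choice(vocab, p=probs)         # sample the next token
print(dict(zip(vocab, probs.round(3))), "->", next_token)
```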

RAG (Retrieval-Augmented Generation)

RAG systems enhance LLMs by retrieving external knowledge before generating responses, improving factual accuracy. They rely heavily on cosine similarity, dot products, and probabilistic matching to identify relevant context.
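
A minimal retrieval sketch, assuming the embeddings are already available from some embedding model (the 3-dimensional vectors below are invented for illustration), ranks candidate passages by cosine similarity to the query:

```python
import numpy as np

# RAG-style retrieval sketch: rank documents by cosine similarity between a
# query embedding and document embeddings. The vectors here are invented;
# a real system would obtain them from an embedding model.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

doc_embeddings = {
    "passage_on_neuroscience": np.array([0.9, 0.1, 0.2]),
    "passage_on_optimization": np.array([0.1, 0.8, 0.3]),
}
query = np.array([0.85, 0.15, 0.1])

ranked = sorted(doc_embeddings.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
print("most relevant context:", ranked[0][0])
# The top-ranked passages would then be added to the prompt before generation.
```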

FT (Fine-Tuning)

Fine-tuning allows a pre-trained model to specialize on niche domains by adjusting its weights through additional training, utilizing constrained optimization and custom loss functions to balance accuracy with generalization.
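
The sketch below illustrates one common flavor of such a custom objective: fitting a small domain-specific dataset while penalizing drift away from the pre-trained weights. The linear model, synthetic data, and penalty strength are illustrative assumptions, not a recipe from any particular framework:

```python
import numpy as np

# Sketch of a fine-tuning objective: fit new domain data while penalizing
# drift away from the pre-trained weights (an L2 "stay close" regularizer).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                    # small domain-specific dataset
w_pre = np.array([1.0, -0.5, 0.3])              # weights from pre-training
y = X @ np.array([1.2, -0.4, 0.5]) + rng.normal(scale=0.1, size=50)

w, lam, lr = w_pre.copy(), 0.1, 0.05
for _ in range(200):
    grad_task = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
    grad_reg = 2 * lam * (w - w_pre)            # gradient of the drift penalty
    w -= lr * (grad_task + grad_reg)

print("fine-tuned weights:", w.round(3))
```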

NLP (Natural Language Processing)

NLP covers a wide range of techniques—tokenization, parsing, semantic analysis—underpinned by mathematics: probability theory, Bayesian inference, and vector space models for semantic similarity.
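
As a toy vector-space example (whitespace tokenization and bag-of-words counts, both drastic simplifications of real NLP pipelines), the snippet below scores the similarity of two sentences:

```python
import math
from collections import Counter

# Minimal NLP sketch: tokenize, build bag-of-words vectors, and compare them
# with cosine similarity in the resulting vector space (illustrative sentences).
def tokenize(text):
    return text.lower().split()

def bow(tokens, vocab):
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

s1, s2 = "the brain inspires AI", "AI mimics the brain"
t1, t2 = tokenize(s1), tokenize(s2)
vocab = sorted(set(t1) | set(t2))
print("similarity:", round(cosine(bow(t1, vocab), bow(t2, vocab)), 3))
```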

Conclusion

AI development is a mirror, revealing both human ingenuity and human limitations. To create truly transformative and ethical AI, we must move beyond ethnocentric paradigms, embrace cognitive diversity, and approach technological advancement with critical reflection and epistemic humility.

Sources

  • Crawford, K. Atlas of AI, Yale, 2021
  • Mitchell, M. Artificial Intelligence, Penguin, 2021
  • Floridi, L. & Chiriatti, M. GPT-3: Its Nature, Scope, Limits..., Minds and Machines, 2020
  • Gartner, Hype Cycle for Artificial Intelligence, 2023
  • The Economist, Le Monde, MIT Technology Review, The New York Times (2023)

Glossary of Key Terms Related to AI

A

  • Algorithm: A set of instructions designed to perform a specific task or solve a problem.
  • Artificial Intelligence (AI): The simulation of human intelligence processes by machines, especially computer systems.

B

  • Bias: Systematic errors in AI outputs caused by prejudiced training data or flawed assumptions in algorithms.
  • Big Data: Extremely large datasets that can be analyzed computationally to reveal patterns, trends, and associations.

C

  • Chatbot: A software application used to conduct online chat conversations via text or text-to-speech.
  • Computer Vision: A field of AI that trains computers to interpret and understand the visual world.

D

  • Data Mining: The process of discovering patterns and knowledge from large amounts of data.
  • Deep Learning: A subset of machine learning based on artificial neural networks with multiple layers.

E

  • Explainable AI (XAI): AI systems designed to be transparent and explain their decisions to human users.

F

  • Fine-tuning: Taking a pre-trained model and continuing training it on a new, usually smaller dataset for a specific task.

G

  • Generative AI: AI models that create new content, such as text, images, audio, or video, based on training data.
  • GPT (Generative Pre-trained Transformer): A type of large language model designed for natural language understanding and generation.

I

  • Inference: The process by which a trained AI model makes predictions or decisions based on new data.

L

  • Large Language Model (LLM): AI models trained on vast amounts of text data to understand and generate human language.
  • Latent Space: The abstract representation of compressed data inside a model, where similar concepts are closer together.

M

  • Machine Learning (ML): A subset of AI where machines learn from data without being explicitly programmed.
  • Model: A mathematical structure trained to recognize patterns or make decisions based on data.

N

  • Neural Network: A computational model of interconnected nodes ("neurons") that learns underlying relationships in data, loosely inspired by the structure of the human brain.

O

  • Overfitting: When a model learns the training data too well, including noise and errors, and performs poorly on new data.

P

  • Prompt Engineering: Crafting input queries to get specific, useful outputs from AI models, especially language models.
  • Pre-training: Initial training phase where a model learns general features before being fine-tuned for a specific task.

R

  • Reinforcement Learning: A type of machine learning where agents learn by receiving rewards or penalties for their actions.
  • Robotics: The field of creating physical machines (robots) capable of performing tasks typically requiring human intelligence.

S

  • Supervised Learning: A type of machine learning where the model is trained on labeled data.
  • Self-supervised Learning: A technique where the model uses parts of the input data itself as labels, reducing the need for manually labeled datasets.

T

  • Tokenization: Breaking down text into smaller parts (tokens) like words or characters for processing by AI models.
  • Transfer Learning: Using knowledge gained while solving one problem and applying it to a different but related problem.

U

  • Unsupervised Learning: A type of machine learning that finds hidden patterns or intrinsic structures in input data without labeled responses.

V

  • Validation Set: A dataset used to tune model parameters and prevent overfitting during training.
  • Voice Recognition: AI technology that identifies and processes human voice input.

Mathematical and Statistical Theorems Applied to AI

1. Key Theorems

  • Law of Large Numbers: as the sample size increases, the sample mean converges to the population mean. It ensures that training on large amounts of data produces more reliable models.
  • Central Limit Theorem (CLT): the mean of a large number of independent random variables is approximately normally distributed. It justifies assuming normal distributions for error terms and confidence intervals.
  • Bayes' Theorem: describes the probability of an event based on prior knowledge of related conditions. It is the basis for probabilistic AI models such as Naive Bayes and Bayesian networks.
  • Markov's and Chebyshev's Inequalities: provide bounds on the probability of large deviations from the mean. They are used to set limits in probabilistic models and in anomaly detection.
  • No Free Lunch Theorem: averaged over all possible problems, no single algorithm performs better than another. It motivates selecting and tuning models carefully for each task.
  • Convergence Theorems (Optimization): guarantee that methods like gradient descent approach an optimal point under certain conditions. They are fundamental to training neural networks.
  • Universal Approximation Theorem: a feedforward neural network with one hidden layer and enough units can approximate any continuous function on a bounded domain. It justifies why deep learning works even with simple architectures.
  • Hoeffding's Inequality: bounds how much the empirical mean deviates from the expected mean. It is important for understanding generalization error in machine learning.
  • PAC Learning (Probably Approximately Correct): a framework describing learnability under uncertainty. It guides how much data is needed to train a model to a desired accuracy.

2. How They Are Implemented in AI

  • Law of Large Numbers: More data leads to better generalization; outliers have less effect (see the simulation after this list).
  • Central Limit Theorem: Loss functions like Mean Squared Error assume Gaussian errors.
  • Bayes' Theorem: Used in Naive Bayes classifiers and Bayesian Networks for updating beliefs.
  • Convergence Theorems: Optimization algorithms (like Gradient Descent, Adam) rely on them.
  • Universal Approximation: Guides deep neural network designs in TensorFlow, PyTorch, etc.
  • PAC Learning: Helps estimate how much data is required for desired model performance.
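
The short simulation below makes the first point concrete: as the sample grows, the running mean of noisy observations settles near the true mean. The distribution and its parameters are arbitrary choices for illustration:

```python
import numpy as np

# Law of Large Numbers in miniature: the sample mean of noisy observations
# approaches the true mean as the sample size grows (toy Gaussian data).
rng = np.random.default_rng(42)
true_mean = 3.0
samples = rng.normal(loc=true_mean, scale=2.0, size=100_000)

for n in (10, 100, 1_000, 100_000):
    print(f"n={n:>6}  sample mean={samples[:n].mean():.3f}  (true mean={true_mean})")
```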

3. How They Work in AI

Example 1: Training a Neural Network

  • Convergence Theorems: Gradient Descent minimizes the loss function under the right conditions (see the toy sketch after this list).
  • Universal Approximation Theorem: neural nets can approximate arbitrarily complex continuous patterns given enough hidden units.
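
A toy illustration of both bullets, assuming nothing beyond NumPy: a one-hidden-layer network trained by plain gradient descent learns to approximate a continuous function (here sin(x)). The layer sizes, learning rate, and epoch count are illustrative choices:

```python
import numpy as np

# One-hidden-layer network fit by gradient descent to approximate sin(x).
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(x)

H = 32                                          # hidden units
W1, b1 = rng.normal(scale=0.5, size=(1, H)), np.zeros(H)
W2, b2 = rng.normal(scale=0.5, size=(H, 1)), np.zeros(1)
lr = 0.05

for epoch in range(2000):
    h = np.tanh(x @ W1 + b1)                    # hidden layer
    pred = h @ W2 + b2                          # output layer
    err = pred - y
    loss = (err ** 2).mean()
    # Backpropagation: gradients of the mean squared error.
    grad_pred = 2 * err / len(x)
    gW2, gb2 = h.T @ grad_pred, grad_pred.sum(0)
    grad_h = grad_pred @ W2.T * (1 - h ** 2)
    gW1, gb1 = x.T @ grad_h, grad_h.sum(0)
    W1, b1, W2, b2 = W1 - lr * gW1, b1 - lr * gb1, W2 - lr * gW2, b2 - lr * gb2

print("final training loss:", round(float(loss), 4))
```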

Example 2: Probabilistic Models (Bayesian Networks)

  • Bayes' Theorem: Models update predictions with new evidence: P(H|E) = (P(E|H) × P(H)) / P(E).
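
Plugged into numbers (a made-up diagnostic-test scenario), the update looks like this:

```python
# Bayes' theorem with concrete numbers (illustrative diagnostic-test scenario):
# P(H|E) = P(E|H) * P(H) / P(E), with P(E) expanded over both hypotheses.
p_h = 0.01              # prior: 1% of cases are positive
p_e_given_h = 0.95      # likelihood of the evidence if the hypothesis is true
p_e_given_not_h = 0.05  # false-positive rate

p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
p_h_given_e = p_e_given_h * p_h / p_e
print(f"posterior P(H|E) = {p_h_given_e:.3f}")   # about 0.161
```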

Example 3: Model Evaluation and Generalization

  • Hoeffding’s Inequality: Provides probabilistic guarantees on model performance on unseen data.
  • PAC Learning: Ensures with high probability that enough data leads to approximately correct predictions.
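
Both guarantees can be turned into concrete numbers; the helper functions below are a sketch under the standard assumption of a metric bounded in [0, 1]:

```python
import math

# Hoeffding's inequality for a bounded quantity (e.g. accuracy in [0, 1]):
# P(|empirical mean - true mean| > eps) <= 2 * exp(-2 * n * eps**2).
# Rearranged, it gives the sample size needed for a desired eps and delta,
# which is the flavor of guarantee PAC learning formalizes.
def hoeffding_bound(n, eps):
    return 2 * math.exp(-2 * n * eps ** 2)

def samples_needed(eps, delta):
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

print("P(deviation > 0.05) with n=1000:", round(hoeffding_bound(1000, 0.05), 4))
print("n needed for eps=0.05, delta=0.01:", samples_needed(0.05, 0.01))
```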

Quick Diagram: Relationships


  • Data size ↑ → Law of Large Numbers → better estimates
  • Model training → gradient descent → Convergence Theorems
  • Model evaluation → Hoeffding + CLT → confidence bounds
  • Generalization → PAC Learning → sample complexity
  • Bayesian reasoning → Bayes' Theorem → updating beliefs
  • Deep learning power → Universal Approximation Theorem

Lagrange and Gauss: Key Theorems for AI

Artificial Intelligence (AI), especially in its optimization and learning components, relies deeply on mathematical principles. Among the fundamental results, Lagrange's constrained optimization and Gauss's error minimization stand out as key pillars.

1. Lagrange: Optimization with Constraints

Lagrange multipliers provide a method to optimize functions under constraints — crucial for AI systems that must satisfy real-world limitations (e.g., resources, rules, physical limits).

Formal Principle

Given a function f(x, y, ...) to maximize/minimize, and a constraint g(x, y, ...)=0, the method of Lagrange multipliers introduces a new variable λ such that:


∇f = λ ∇g
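
As an illustration (independent of any particular AI system), the sketch below hands a toy constrained problem, maximize f(x, y) = x*y subject to x + y = 1, to SciPy's SLSQP solver, which handles the Lagrangian machinery internally; the analytic optimum from ∇f = λ ∇g is x = y = 0.5:

```python
import numpy as np
from scipy.optimize import minimize

# Constrained optimization sketch: maximize f(x, y) = x * y subject to x + y = 1.
objective = lambda v: -(v[0] * v[1])                  # negate to maximize
constraint = {"type": "eq", "fun": lambda v: v[0] + v[1] - 1}

result = minimize(objective, x0=np.array([0.1, 0.9]),
                  constraints=[constraint], method="SLSQP")
print("optimum:", result.x.round(3), "max value:", round(-result.fun, 3))
```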

Impact on AI

  • Training Neural Networks: Regularization terms (L2, L1) act as "soft constraints."
  • Reinforcement Learning: Agents maximize reward while obeying environmental limits.
  • Support Vector Machines (SVMs): Use Lagrangian duality to maximize margins under separability constraints.
  • Optimal Control in Robotics: Robots compute optimal actions under mechanical and energy limits.

Summary: Lagrange helps AI optimize behavior while respecting constraints.


2. Gauss: Error Minimization (Least Squares)

Carl Friedrich Gauss introduced the method of least squares — the foundation for error minimization in data fitting and machine learning.

Formal Principle

Given data points (xᵢ, yᵢ), find the model parameters θ that minimize the sum of squared errors:

min_θ Σᵢ (yᵢ - f(xᵢ, θ))²
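
A minimal least-squares fit with NumPy (the synthetic data and noise level are arbitrary) recovers the slope and intercept of a noisy line:

```python
import numpy as np

# Least-squares sketch: fit y ≈ a*x + b to noisy data by minimizing the sum of
# squared errors; np.linalg.lstsq computes the least-squares solution directly.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=50)    # true slope 2, intercept 1

A = np.column_stack([x, np.ones_like(x)])             # design matrix [x, 1]
theta, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print("estimated slope and intercept:", theta.round(2))
```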

Impact on AI

  • Linear Regression: Fundamental method based on least squares.
  • Backpropagation in Neural Networks: Minimizes loss functions using gradients — rooted in error minimization.
  • Kalman Filters: Predict states in uncertain environments by minimizing expected squared error.
  • Probabilistic Models: Assume errors are normally distributed — a legacy of Gauss's work on the normal distribution.

Summary: Gauss provides AI the mathematical tools to learn from noisy data by minimizing errors.


Why Together?

Lagrange = optimize what to do under constraints.
Gauss = optimize how to best match reality.

In advanced AI (like reinforcement learning, self-supervised learning, or probabilistic inference), both theorems intertwine:

  • You optimize actions (Lagrange)
  • While minimizing errors in the perception of the world (Gauss).

Visual Summary

  • Purpose: Lagrange optimizes under constraints; Gauss minimizes error.
  • Mathematical tool: Lagrangian multipliers vs. least squares and the normal distribution.
  • AI applications: SVMs, reinforcement learning and control vs. regression, neural networks and Kalman filters.
  • Core idea: obey limits while maximizing performance vs. best fit of a model to noisy data.

By combining the strengths of Lagrange and Gauss, AI systems achieve both intelligent decision-making and resilient learning from imperfect data.

Summary

Mathematical and statistical theorems are the hidden scaffolding that makes AI models reliable, trainable, explainable, and powerful. Without them, we wouldn’t know how much data we need, why optimization works, or why models generalize to unseen data.
