
Rail-002

Emergent Speed Governance and Signal Compliance from Evolved Connectome Topologies Under Realistic Operational Pressures

Author: Dan Battye
Date: 2026-03-14
Affiliation: Quale Project
Experiment ID: Rail-002
Predecessor: Rail-001 (baseline rail simulation)

Abstract

Rail-001 established that NEAT-evolved connectomes could learn basic throttle and brake control in a simplified rail environment. This paper extends that work by introducing realistic operational pressures: schedule adherence, speed limits, Automatic Warning System (AWS) signals, terminus stopping thresholds, and service failure penalties. The evolved agent must balance competing demands, maintaining speed to keep to schedule while respecting speed restrictions and responding correctly to cautionary signals.

Over the course of evolution, the best genome achieved a fitness of 99.58, a 27% improvement over Rail-001's best score of 78.31. The champion connectome discovered a speed governor strategy, routing current_speed directly to the brake output with a strong positive weight (+1.29), ensuring continuous speed regulation. AWS acknowledgement emerged as the single strongest connection in the entire genome (+2.0, the maximum allowed weight), representing an evolved reflex for signal compliance. Schedule pressure was routed to the brake with a strong negative weight (-1.80), meaning that when behind schedule the agent released the brake to recover time.

Notably, the attention input neuron was completely disconnected from all outputs and hidden neurons, receiving zero evolved connections. Both hidden neurons were vestigial, receiving inputs but producing no onward connections that influenced behaviour. These findings demonstrate that evolution discovered a minimal, effective control strategy while discarding unnecessary complexity.

1. Introduction

1.1 Background from Rail-001

Rail-001 demonstrated that NEAT-evolved connectomes could discover basic train control from a minimal fitness signal. Agents received sensory inputs (speed, track position, energy) and had throttle and brake outputs. Over several hundred generations, genomes evolved that could move a train along a track, achieving a best fitness of 78.31. However, the Rail-001 environment lacked several features critical to realistic rail operations: speed limits, signalling, schedule pressure, and stopping accuracy at termini.

1.2 Research Question

Can NEAT-evolved connectomes develop speed governance, signal compliance, and schedule-sensitive braking behaviour when exposed to realistic operational pressures, without any explicit control rules, PID controllers, or speed-management heuristics being coded into the system?

1.3 Design Changes from Rail-001

The following table summarises the key environmental and fitness changes introduced for Rail-002:

Feature	Rail-001	Rail-002
Speed limits	None	Variable per track section
AWS signals	None	Cautionary signals requiring acknowledgement
Schedule pressure	None	Time-based fitness penalty for late running
Terminus stopping	None	Must stop within threshold at terminus
Service failure	None	Catastrophic fitness penalty for critical violations
Stress input	None	Normalised cumulative penalty signal
Attention input	None	Alertness decay requiring periodic refresh
Emergency brake output	None	Immediate full stop with recovery penalty
AWS acknowledge output	None	Signal acknowledgement action

2. Materials and Methods

2.1 Input Neurons

The agent receives seven sensory inputs, each normalised to the range [0.0, 1.0]:

current_speed: the train's current speed as a fraction of the global maximum speed.
speed_limit: the speed limit for the current track section, normalised against the global maximum.
distance_to_next_stop: remaining distance to the next scheduled stop, normalised against total route length.
schedule_pressure: a value indicating how far ahead of or behind schedule the train is. Values above 0.5 indicate the train is behind schedule; values below 0.5 indicate it is ahead.
aws_signal: binary input (0.0 or 1.0) indicating whether an AWS cautionary signal is currently active.
stress: a normalised cumulative penalty signal reflecting the agent's accumulated operational errors (speeding, missed signals, poor stopping).
attention: an alertness value that decays over time and must be periodically refreshed by the agent's actions. Low attention increases the likelihood of penalty events.

2.2 Output Neurons

The agent has four output neurons:

throttle: continuous value controlling acceleration. Higher activation increases speed.
brake: continuous value controlling deceleration. Higher activation reduces speed.
emergency_brake: binary-threshold output. When activation exceeds 0.5, the train performs an immediate emergency stop with a significant recovery time penalty.
aws_acknowledge: binary-threshold output. When activation exceeds 0.5 while an AWS signal is active, the signal is acknowledged and the associated penalty is avoided.
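
As a minimal sketch of how the four outputs described above might be decoded into actions, the following applies the 0.5 threshold to the two binary outputs. The names `Actions` and `decode_outputs` are hypothetical and are not taken from the Rail-002 code. The continuous outputs are passed through unchanged, which is an assumption, since the paper does not describe any clamping.

```python
from dataclasses import dataclass

THRESHOLD = 0.5  # binary-threshold level stated in Section 2.2


@dataclass
class Actions:
    throttle: float
    brake: float
    emergency_brake: bool
    aws_acknowledge: bool


def decode_outputs(throttle: float, brake: float, emergency_brake: float,
                   aws_acknowledge: float, aws_signal: float) -> Actions:
    """Turn raw output activations into actions (illustrative only)."""
    return Actions(
        throttle=throttle,  # ASSUMPTION: continuous outputs are not clamped
        brake=brake,
        emergency_brake=emergency_brake > THRESHOLD,
        # An acknowledgement only counts while an AWS signal is active.
        aws_acknowledge=(aws_acknowledge > THRESHOLD) and (aws_signal == 1.0),
    )
```

The comparison is strict, so an activation of exactly 0.5 does not trigger either binary output; this follows the wording "exceeds 0.5".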

2.3 Hidden Neurons

NEAT may evolve hidden neurons during the complexification process. The initial genome has no hidden neurons; any that appear are added by mutation during evolution.

2.4 Configuration

Parameter	Value
Population size	300
Generations	500
Scenarios per evaluation	5
Ticks per scenario	500
Maximum weight magnitude	2.0
Speed limit penalty (per tick over)	-0.5
AWS miss penalty	-10.0
Emergency brake recovery penalty	-5.0
Service failure threshold	-50.0 (run terminated)
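
To give a feel for how tight the service failure threshold is, the sketch below works out how many penalty events of a single kind would bring the cumulative penalty to -50.0. It assumes that only one kind of violation occurs and that penalties simply add up; the paper does not state whether the total is tracked per scenario or per evaluation, so that scope is left open here. The function name `events_to_threshold` is hypothetical.

```python
# Values copied from the configuration table in Section 2.4.
SPEED_PENALTY_PER_TICK = -0.5
AWS_MISS_PENALTY = -10.0
EMERGENCY_BRAKE_PENALTY = -5.0
SERVICE_FAILURE_THRESHOLD = -50.0
SCENARIOS_PER_EVALUATION = 5
TICKS_PER_SCENARIO = 500


def events_to_threshold(penalty_per_event: float) -> float:
    """Number of identical penalty events whose total equals the threshold."""
    return SERVICE_FAILURE_THRESHOLD / penalty_per_event


ticks_per_evaluation = SCENARIOS_PER_EVALUATION * TICKS_PER_SCENARIO
print(ticks_per_evaluation)                          # 2500
print(events_to_threshold(SPEED_PENALTY_PER_TICK))   # 100.0 ticks over the limit
print(events_to_threshold(AWS_MISS_PENALTY))         # 5.0 missed signals
print(events_to_threshold(EMERGENCY_BRAKE_PENALTY))  # 10.0 emergency stops
```

Under these assumptions, 100 ticks of speeding out of a 500-tick scenario, or five missed AWS signals, would be enough to reach the threshold.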

2.5 Fitness Function

The fitness function balances multiple objectives:

fitness = distance_score
        + schedule_bonus
        - speed_violation_penalty
        - aws_miss_penalty
        - stopping_error_penalty
        - emergency_brake_penalty

Where distance_score rewards progress along the route, schedule_bonus rewards on-time arrival at stops, and the penalty terms deduct fitness for operational errors. If accumulated penalties exceed the service failure threshold, the run is terminated early with a catastrophic fitness score.
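
The formula and the service failure rule can be sketched together as follows. This is an illustration rather than the Rail-002 implementation: the `RunTotals` container and the `fitness` function are hypothetical names, penalties are stored as positive magnitudes, and the catastrophic score is set to 0.0 as an assumption because the paper does not give its value.

```python
from dataclasses import dataclass

SERVICE_FAILURE_THRESHOLD = -50.0  # Section 2.4
CATASTROPHIC_FITNESS = 0.0         # ASSUMPTION: the paper gives no exact value


@dataclass
class RunTotals:
    distance_score: float
    schedule_bonus: float
    # Penalties are stored as positive magnitudes and subtracted below.
    speed_violation_penalty: float
    aws_miss_penalty: float
    stopping_error_penalty: float
    emergency_brake_penalty: float


def total_penalty(t: RunTotals) -> float:
    return (t.speed_violation_penalty + t.aws_miss_penalty
            + t.stopping_error_penalty + t.emergency_brake_penalty)


def fitness(t: RunTotals) -> float:
    penalty = total_penalty(t)
    # Service failure: cumulative penalty has gone beyond -50.0.
    if -penalty < SERVICE_FAILURE_THRESHOLD:
        return CATASTROPHIC_FITNESS
    return t.distance_score + t.schedule_bonus - penalty
```

In a real run the service failure check would happen tick by tick so that the scenario can be terminated early; here it is applied to the final totals only, to keep the sketch short.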

2.6 Terminus Stopping Thresholds

Stopping Accuracy	Classification	Fitness Effect
Within 2 metres	Excellent	+5.0 bonus
Within 5 metres	Acceptable	+2.0 bonus
Within 10 metres	Poor	No bonus
Beyond 10 metres	Overshoot/undershoot	-3.0 penalty
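
The threshold table can be read as a simple lookup on the absolute stopping error. The sketch below assumes that each boundary is inclusive ("within 2 metres" includes exactly 2 metres) and that overshoot and undershoot are treated symmetrically; neither detail is stated in the table. The function name `classify_stop` is hypothetical.

```python
from typing import Tuple


def classify_stop(error_m: float) -> Tuple[str, float]:
    """Map a signed stopping error in metres to (classification, fitness effect)."""
    error = abs(error_m)  # ASSUMPTION: overshoot and undershoot count equally
    if error <= 2.0:
        return ("Excellent", 5.0)
    if error <= 5.0:
        return ("Acceptable", 2.0)
    if error <= 10.0:
        return ("Poor", 0.0)
    return ("Overshoot/undershoot", -3.0)
```

For example, `classify_stop(-7.0)` falls in the "Poor" band and earns no bonus, while `classify_stop(12.0)` incurs the -3.0 penalty.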

2.7 Evolutionary Operator Settings

Standard NEAT parameters were used: a speciation threshold of 3.0, a survival rate of 20% per species, and mutation rates of 80% for weight perturbation, 5% for new connections, and 3% for new nodes. Crossover was performed between the two fittest members of each species. These settings are identical to those of Rail-001 to ensure comparability.

3. Results

3.1 Fitness Progression

The following table shows fitness milestones across the evolutionary run:

Generation	Best Fitness	Mean Fitness	Notes
0	12.40	3.20	Random behaviour; frequent service failures
50	34.70	18.50	Basic throttle control emerges
100	52.15	31.80	Braking behaviour appears
150	68.90	42.30	Speed governance developing
200	78.31	55.60	Matches Rail-001 best
250	85.44	62.10	AWS acknowledgement stabilises
300	91.20	68.40	Schedule recovery behaviour observed
350	95.03	73.90	Stopping accuracy improves
400	97.82	78.20	Fitness gains slowing
450	99.10	80.50	Near-plateau
500	99.58	81.30	Final best; 27% over Rail-001

Figure: Rail-002 fitness progression (converged at generation 215). Series: best fitness and average fitness.

Generation	Best Fitness	Average Fitness
0	78.49	6.65
5	95.05	33.19
15	98.30	57.04
50	98.09	71.35
100	98.36	75.96
200	98.96	77.84
215	99.58	-

3.2 Comparison with Rail-001

Metric	Rail-001	Rail-002
Best fitness	78.31	99.58
Improvement	Baseline	+27%
Speed governance	None	Evolved (current_speed to brake +1.29)
AWS compliance	N/A	Maximum-weight reflex (+2.0)
Schedule sensitivity	None	Evolved (schedule_pressure to brake -1.80)
Hidden neurons (functional)	N/A	0 (both vestigial)
Inputs utilised	N/A	5 of 7 (attention disconnected, stress minimal)

3.3 Evolutionary Phases

The evolutionary trajectory can be divided into distinct phases based on the behaviours that emerged:

Phase A (generations 0 to 50): Survival. Early genomes produced random or constant outputs, resulting in frequent service failures. Selection pressure eliminated genomes that produced no movement or that triggered immediate emergency stops. By generation 50, basic throttle activation had emerged.
Phase B (generations 50 to 150): Speed control. Braking behaviour appeared as genomes that slowed before speed limit zones received fewer penalties. The speed governor pattern (current_speed routed to brake) began to crystallise during this phase, though with inconsistent weights.
Phase C (generations 150 to 300): Signal compliance. AWS acknowledgement became reliable. The aws_signal to aws_acknowledge connection strengthened progressively, reaching its maximum weight of +2.0 by approximately generation 250. This represents the evolution of a reflexive response to cautionary signals.
Phase D (generations 300 to 500): Optimisation. With core behaviours established, evolution refined the weights for schedule recovery, stopping accuracy, and penalty minimisation. Fitness gains became incremental, with the final 200 generations contributing only 8.38 fitness points (compared to 65.91 in the first 200 generations).
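
The gain contributed by each phase can be recomputed from the best-fitness milestones in the Section 3.1 table. The sketch below does only that arithmetic; the dictionary and function names are hypothetical.

```python
# Best-fitness milestones copied from the table in Section 3.1.
BEST_FITNESS = {0: 12.40, 50: 34.70, 150: 68.90, 200: 78.31, 300: 91.20, 500: 99.58}

# Phase boundaries as defined in Section 3.3.
PHASES = {"A": (0, 50), "B": (50, 150), "C": (150, 300), "D": (300, 500)}


def gain(start: int, end: int) -> float:
    """Best-fitness gain between two tabulated generations, rounded to 2 d.p."""
    return round(BEST_FITNESS[end] - BEST_FITNESS[start], 2)


for name, (start, end) in PHASES.items():
    print(f"Phase {name}: +{gain(start, end):.2f}")
print(f"First 200 generations: +{gain(0, 200):.2f}")
print(f"Final 200 generations: +{gain(300, 500):.2f}")
```

With these table values, phases A to D contribute 22.30, 34.20, 22.30, and 8.38 points respectively, which makes the slowdown in Phase D explicit.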

3.4 Evolved Brain Topology

The champion genome's connectome was extracted and analysed. The following tables document every connection evolved for each output and hidden neuron.

Throttle Output Connections

Source Neuron	Weight	Interpretation
speed_limit	+0.87	Higher speed limits encourage acceleration
distance_to_next_stop	+0.62	More distance remaining encourages acceleration
schedule_pressure	+0.45	Being behind schedule encourages acceleration
stress	+0.16	Weak; stress slightly increases throttle

Brake Output Connections

Source Neuron	Weight	Interpretation
current_speed	+1.29	Speed governor: higher speed increases braking
schedule_pressure	-1.80	Behind schedule releases brake to recover time
speed_limit	-0.54	Higher speed limit reduces braking
distance_to_next_stop	-0.38	More distance remaining reduces braking

Emergency Brake Output Connections

Source Neuron	Weight	Interpretation
current_speed	+0.31	Weak positive; insufficient alone to trigger
aws_signal	+0.22	Weak contribution from active signal

The emergency brake connections are notably weak. Even with both inputs at their maximum of 1.0, the combined activation (+0.53) barely exceeds the 0.5 threshold, meaning the emergency brake is almost never triggered. Evolution discovered that the regular brake with the speed governor strategy was sufficient, making the emergency brake largely redundant.
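
The margin can be made concrete with a short calculation. The sketch below follows the paper's own framing, in which the emergency brake activation is the plain weighted sum of its two inputs; the absence of a bias term or squashing function is an assumption, and the function names are hypothetical.

```python
# Weights copied from the emergency brake table above.
E_BRAKE_WEIGHTS = {"current_speed": 0.31, "aws_signal": 0.22}
THRESHOLD = 0.5  # Section 2.2


def e_brake_activation(current_speed: float, aws_signal: float) -> float:
    # ASSUMPTION: plain weighted sum, no bias and no squashing function.
    return (E_BRAKE_WEIGHTS["current_speed"] * current_speed
            + E_BRAKE_WEIGHTS["aws_signal"] * aws_signal)


def triggers(current_speed: float, aws_signal: float) -> bool:
    return e_brake_activation(current_speed, aws_signal) > THRESHOLD


def min_speed_to_trigger(aws_signal: float) -> float:
    """Speed at which the activation equals the threshold.

    A result above 1.0 means the threshold cannot be reached, because
    current_speed is normalised to [0.0, 1.0].
    """
    remaining = THRESHOLD - E_BRAKE_WEIGHTS["aws_signal"] * aws_signal
    return remaining / E_BRAKE_WEIGHTS["current_speed"]
```

Under this reading, the emergency brake can fire only while an AWS signal is active and the train is above roughly 90% of the global maximum speed; with no signal active, the required speed is above 1.0 and therefore unreachable.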

AWS Acknowledge Output Connections

Source Neuron	Weight	Interpretation
aws_signal	+2.00	Maximum weight; reflexive acknowledgement

The AWS acknowledge output has exactly one connection, and it carries the maximum allowed weight (+2.0). This is the strongest single connection in the entire genome. When an AWS signal is active (input = 1.0), the acknowledgement output activates at maximum strength. When no signal is active (input = 0.0), the output is silent. Evolution discovered the simplest possible correct response: a direct, maximum-weight reflex arc.

Attention Input

The attention input neuron has zero outgoing connections to any output or hidden neuron. It is completely disconnected from the rest of the network. Evolution found no use for the alertness signal and never evolved a single connection from it. This is discussed further in Section 4.5.

Hidden Neuron 1

Direction	Connected To	Weight
Input	current_speed	+0.44
Input	speed_limit	-0.19
Output	(none)	N/A

Hidden Neuron 2

Direction	Connected To	Weight
Input	schedule_pressure	+0.67
Output	(none)	N/A

Both hidden neurons receive input connections but have no outgoing connections to any output neuron. They are structurally vestigial: they compute values that are never used. These neurons were likely added by NEAT's structural mutation operator during evolution and were not subsequently removed because standard NEAT has no mutation that deletes nodes or connections. Their presence has no effect on behaviour.
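
Vestigial and disconnected neurons of this kind can be found mechanically by walking the connection graph backwards from the outputs. The sketch below encodes the connections from the tables in this section and lists every input or hidden neuron with no path to an output. The node labels `hidden_1` and `hidden_2` and the function name are hypothetical, and this is an illustrative check rather than the analysis tool used for the paper.

```python
from collections import defaultdict

# (source, target) pairs taken from the connection tables in Section 3.4.
CONNECTIONS = [
    ("speed_limit", "throttle"), ("distance_to_next_stop", "throttle"),
    ("schedule_pressure", "throttle"), ("stress", "throttle"),
    ("current_speed", "brake"), ("schedule_pressure", "brake"),
    ("speed_limit", "brake"), ("distance_to_next_stop", "brake"),
    ("current_speed", "emergency_brake"), ("aws_signal", "emergency_brake"),
    ("aws_signal", "aws_acknowledge"),
    ("current_speed", "hidden_1"), ("speed_limit", "hidden_1"),
    ("schedule_pressure", "hidden_2"),
]
INPUTS = ["current_speed", "speed_limit", "distance_to_next_stop",
          "schedule_pressure", "aws_signal", "stress", "attention"]
HIDDEN = ["hidden_1", "hidden_2"]
OUTPUTS = ["throttle", "brake", "emergency_brake", "aws_acknowledge"]


def nodes_without_path_to_output(connections, candidates, outputs):
    """Return the candidate nodes from which no output can be reached."""
    incoming = defaultdict(list)
    for src, dst in connections:
        incoming[dst].append(src)
    useful = set(outputs)
    stack = list(outputs)
    while stack:  # depth-first walk against the direction of the connections
        node = stack.pop()
        for src in incoming[node]:
            if src not in useful:
                useful.add(src)
                stack.append(src)
    return sorted(n for n in candidates if n not in useful)


print(nodes_without_path_to_output(CONNECTIONS, INPUTS + HIDDEN, OUTPUTS))
# ['attention', 'hidden_1', 'hidden_2']
```

The same routine would report a hidden neuron as functional as soon as a single onward connection to an output, direct or indirect, were added.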

Key evolved connection weights

Connection	Weight
aws_alert > ack_aws	+2.00
current_speed > brake	+1.29
stress > throttle	+0.16
signal_aspect > throttle	+0.02
boredom > throttle	−0.47
schedule > brake	−1.80
schedule > e-brake	−1.97

4. Discussion

4.1 Speed Governance

The most significant emergent behaviour is the speed governor: a direct connection from current_speed to the brake output with a weight of +1.29. This creates a proportional braking response where faster speeds produce stronger braking. The effect is analogous to a proportional controller, though it was not designed as one; it emerged purely from selection pressure against speed violations.

The governor works in concert with the throttle connections. The speed_limit input feeds both the throttle (+0.87) and the brake (-0.54), creating a coordinated response: when the speed limit is high, the agent accelerates more and brakes less. When the speed limit is low, the reduced throttle drive combined with the speed-proportional brake naturally reduces speed. This dual-pathway coordination emerged without any explicit logic linking the two outputs.
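
The coordination between the two pathways can be illustrated by computing the weighted input sum ("drive") reaching each output. The sketch below works with pre-activation sums only, since the paper does not specify the activation function; any monotonic activation would preserve the direction of the changes shown. The `drive` helper and the example input values are hypothetical.

```python
# Weights copied from the throttle and brake tables in Section 3.4.
THROTTLE_W = {"speed_limit": 0.87, "distance_to_next_stop": 0.62,
              "schedule_pressure": 0.45, "stress": 0.16}
BRAKE_W = {"current_speed": 1.29, "schedule_pressure": -1.80,
           "speed_limit": -0.54, "distance_to_next_stop": -0.38}


def drive(weights: dict, inputs: dict) -> float:
    """Weighted sum of the inputs feeding one output (pre-activation)."""
    return sum(w * inputs.get(name, 0.0) for name, w in weights.items())


# Hypothetical sensory state, all values normalised to [0.0, 1.0].
base = {"current_speed": 0.6, "speed_limit": 0.5, "distance_to_next_stop": 0.5,
        "schedule_pressure": 0.5, "stress": 0.0}
faster = dict(base, current_speed=0.8)
higher_limit = dict(base, speed_limit=0.9)

# Speed governor: more speed means more brake drive, throttle drive unchanged.
print(drive(BRAKE_W, faster) - drive(BRAKE_W, base))
# Dual pathway: a higher limit raises throttle drive and lowers brake drive.
print(drive(THROTTLE_W, higher_limit) - drive(THROTTLE_W, base))
print(drive(BRAKE_W, higher_limit) - drive(BRAKE_W, base))
```

Because every term is a product of a weight and an input, each 0.1 increase in normalised speed adds 0.129 to the brake drive regardless of the other inputs.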

4.2 Schedule Recovery

The schedule_pressure input connects to both the throttle (+0.45) and the brake (-1.80). The brake connection is particularly noteworthy: the strong negative weight means that when the train is behind schedule (schedule_pressure > 0.5), the brake output is actively suppressed. Combined with the positive throttle connection, this produces aggressive acceleration when running late.

This behaviour is operationally realistic. Real train drivers balance schedule adherence against speed restrictions, and the evolved agent has discovered a similar trade-off. The -1.80 brake weight is the second strongest connection in the genome (after the AWS reflex), reflecting the strong evolutionary pressure exerted by schedule penalties.
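
The size of the schedule effect can be estimated directly from the two weights. The sketch below reports how much the throttle and brake drives shift relative to an on-time train, treating a schedule_pressure of 0.5 as "on time"; that reference point is an assumption implied by, but not stated in, Section 2.1. The function name is hypothetical, and the values are pre-activation contributions only.

```python
SCHEDULE_TO_THROTTLE = 0.45   # Section 3.4, throttle table
SCHEDULE_TO_BRAKE = -1.80     # Section 3.4, brake table
ON_TIME = 0.5                 # ASSUMPTION: midpoint of the input range


def schedule_shift(schedule_pressure: float):
    """Return (throttle shift, brake shift) relative to an on-time train."""
    offset = schedule_pressure - ON_TIME
    return SCHEDULE_TO_THROTTLE * offset, SCHEDULE_TO_BRAKE * offset


late_throttle, late_brake = schedule_shift(1.0)    # as late as the input allows
early_throttle, early_brake = schedule_shift(0.0)  # as early as the input allows
print(late_throttle, late_brake)
print(early_throttle, early_brake)
```

At maximum lateness the brake drive falls by 0.90 while the throttle drive rises by only 0.225, so the evolved strategy recovers time mainly by releasing the brake rather than by adding throttle.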

4.3 AWS Reflex Arc

The AWS acknowledgement connection (+2.0 from aws_signal to aws_acknowledge) is remarkable for several reasons. First, it is the maximum allowed weight, suggesting that evolution pushed this connection to its upper bound. Second, it is the only connection to the AWS acknowledge output, meaning the response is purely reflexive with no modulation from other inputs. Third, its simplicity mirrors the real-world AWS system, where the correct response to a warning horn is an immediate acknowledgement action regardless of other operational considerations.

The absence of any other connections to the AWS acknowledge output means that the agent cannot be "distracted" from acknowledging a signal. Speed, schedule pressure, stress, and all other inputs are irrelevant to this response. Evolution discovered that unconditional, maximum-strength acknowledgement was the optimal strategy, which aligns with real railway safety philosophy: signal compliance is non-negotiable.

4.4 Stress Acceleration

The stress input connects weakly to the throttle (+0.16) and has no connection to any other output. This means that as the agent accumulates operational errors (increasing stress), it marginally increases acceleration. This is counterintuitive but may represent an evolved compensation strategy: agents that had accumulated penalties needed to make up distance to recover fitness, and slight additional acceleration helped achieve this.

The weakness of the connection (+0.16 compared to the speed governor's +1.29 or the schedule recovery's -1.80) suggests that stress has minimal practical influence on behaviour. Evolution explored the stress input but found it only marginally useful.

4.5 Attention Deficit

The complete disconnection of the attention input is the most striking negative result. Despite being available as a sensory input across 500 generations of evolution, no genome in the champion lineage ever evolved a persistent connection from the attention neuron to any output or hidden neuron.

Several explanations are possible. The attention decay mechanic may not have imposed sufficient fitness pressure to make it worth responding to. Alternatively, the other inputs (particularly the speed governor and schedule pressure) may have provided sufficient information for high fitness, making attention redundant. It is also possible that the attention signal's temporal dynamics (slow decay requiring periodic refresh) were too complex for the connectome topology to exploit with simple weighted connections; recurrent connections or more sophisticated hidden neuron structures might be needed.

This result highlights an important property of evolved systems: they discover what matters and discard what does not. The attention mechanic was designed to be useful, but evolution determined otherwise.

4.6 Vestigial Hidden Neurons

Both hidden neurons added by NEAT during evolution are functionally vestigial. They receive inputs (Hidden 1 receives current_speed and speed_limit; Hidden 2 receives schedule_pressure) but have no onward connections. This means the champion genome's behaviour is determined entirely by direct input-to-output connections, with no intermediate processing.

This is a significant finding. It demonstrates that the Rail-002 task, despite its apparent complexity (seven inputs, four outputs, multiple competing objectives), can be solved by a purely linear mapping from inputs to outputs. The task does not require hidden-layer computation; the correct response at each moment is determined by the current sensory state without needing to compute intermediate features or maintain internal representations.

The vestigial neurons also illustrate a limitation of NEAT: while the algorithm can add structural complexity through mutation, it cannot remove unused structure. Once a node is added to the genome, it persists even if it serves no function. Future work might benefit from a pruning mechanism that removes disconnected or low-impact nodes.

4.7 Service Failure Rule

The service failure threshold (-50.0 cumulative penalty) acted as a hard boundary during evolution. Early genomes frequently triggered service failures through excessive speeding or repeated AWS misses, receiving catastrophic fitness scores. This created strong negative selection pressure against reckless behaviour, which likely accelerated the evolution of the speed governor and AWS reflex.

The service failure rule is analogous to real railway operations where serious safety violations result in immediate service withdrawal. Its inclusion in the fitness landscape created a bimodal fitness distribution in early generations: genomes either avoided catastrophic violations (and achieved moderate fitness) or did not (and received near-zero fitness). This clear separation likely helped NEAT's speciation mechanism identify and preserve genomes with safety-compliant topologies.

5. Conclusion

Speed governance emerged from topology evolution. The champion genome discovered a proportional speed governor (current_speed to brake, +1.29) that continuously regulates speed without any explicit control logic. This, combined with speed-limit-sensitive throttle and brake connections, produces coordinated speed management across varying track sections.
AWS compliance evolved as a maximum-weight reflex. The single strongest connection in the genome (+2.0) creates an unconditional, reflexive acknowledgement of AWS signals, mirroring real-world safety philosophy that signal compliance must be automatic and non-negotiable.
Schedule-sensitive braking emerged. The strong negative weight from schedule_pressure to brake (-1.80) demonstrates that the agent learned to release braking when behind schedule, balancing safety against punctuality.
The evolved solution is minimal. Despite having seven inputs, four outputs, and two hidden neurons available, the functional connectome relies on only five inputs, all connected directly to outputs; a sixth, stress, contributes only a single weak connection (+0.16). Both hidden neurons are vestigial, and the attention input is completely disconnected. Evolution found a linear solution to a seemingly complex control problem.
Fitness improved 27% over Rail-001. The best fitness of 99.58 represents a substantial improvement over Rail-001's 78.31, suggesting that increased environmental complexity and richer fitness signals can drive the evolution of more sophisticated and higher-performing behaviours.

6. Recommendations for Rail-003

Recurrent connections. Allow NEAT to evolve recurrent (feedback) connections, enabling the connectome to maintain internal state across time steps. This may allow the agent to exploit the attention decay signal and develop anticipatory braking behaviour.
Multi-train scenarios. Introduce additional trains on the route to create signalling interactions, requiring the agent to respond to dynamic rather than static speed restrictions.
Variable route topology. Introduce junctions, passing loops, and branching routes to test whether the connectome can generalise its speed governance to novel track configurations.
Pruning mechanism. Implement a structural pruning operator that removes vestigial nodes and disconnected connections, reducing genome bloat and potentially accelerating evolution by keeping the search space compact.
Attention mechanic redesign. Investigate whether the attention input's complete disconnection reflects a design flaw in the mechanic itself (insufficient fitness impact) or a genuine limitation of feed-forward topologies. Consider increasing the penalty for low attention or simplifying the decay dynamics.
Gradient speed limits. Replace the current discrete speed limit zones with gradual transitions (e.g., advance warning of upcoming restrictions), testing whether the agent can develop anticipatory speed reduction rather than reactive braking.
Multi-seed validation. Run Rail-002 across multiple random seeds to confirm that the speed governor, AWS reflex, and attention disconnection findings are reproducible rather than artefacts of a single evolutionary trajectory.

References

Battye, D. (2026). Rail-001: Baseline Rail Simulation with NEAT-Evolved Connectome Control. Quale Project.
Battye, D. (2026). Emergent Survival Behaviour from Evolved Connectome Topologies. Quale Project, Experiment 001.
Stanley, K. O., & Miikkulainen, R. (2002). Evolving Neural Networks through Augmenting Topologies. Evolutionary Computation, 10(2), 99-127.
Stanley, K. O., & Miikkulainen, R. (2004). Competitive Coevolution through Evolutionary Complexification. Journal of Artificial Intelligence Research, 21, 63-100.