Fawkes vs. RCSoftX
Joel Spolsky said: [The] single worst strategic mistake that any software company can make: They decided to rewrite the code from scratch.
We did it anyway, and we believe we had good reasons to do so. One major reason was that new technologies had emerged and better hardware was available than when the old software was written. This allowed us to think in new ways and to develop new paradigms. It also made clear that the old software had shortcomings and problems that were not easy to solve without a major rewrite.
Of course the rewrite also had drawbacks: some tools have not been ported yet, and some functionality (such as laser-based localization) was lost and still has to be ported to the new system. The tight integration of the new system is both one of its biggest bonuses and one of its major drawbacks. Communication paths are short and messaging is very fast. But since everything runs in a single process, one plugin that, for example, produces a segfault crashes the whole software at once. Because of this, high software quality and the use of all available debugging tools have become even more important than before.
The old software has been kept around and is still used in the RoboCup@Home league, since it was available when we participated in that league. Up to now no effort has been started to switch to the new software for this league, so as not to interrupt the current development flow. At some point a switch might be a good way to start over and exploit the new features for easier and faster development, which we think has been streamlined in the new software.
The following table is a simple comparison of aspects that have changed. Some of them influenced the decision; others only became clear after the rewrite, when thinking about it a second or third time. For now, my answer to the question of whether the rewrite was a good idea is: Yes, definitely! Read on to see why.
| Fawkes | RCSoftX |
| BlackBoard | |
| single-process access only (could be extended to allow multi-process access; we decided against this feature for performance reasons) | multi-process access |
| read/write lock per interface | global mutex for all interfaces at once and for any operation |
| at most one writer per interface | any number of writers per interface, can cause problems if two processes overwrite each other's data |
| read/write operation is a small memory block memcpy | string search for each value of an interface and then one copy per value |
| well-designed interface generator, meant to be the only way to create an interface | many custom hand-written interfaces, interface generator was a hack |
| preparations to log every single bit transmitted over the BlackBoard | only data transmitted over the network can be logged |
| preparations for network access | network access with forward error correction |
| introspection can be added to interfaces | no introspection support |
| Network Transmission (WorldInfo) | |
| multicast peer-to-peer | only via central control host |
| forward error correction | |
| encrypted | world-readable |
| Aspects | |
| unified way to access Fawkes features | nothing comparable |
| initialization guarantees | |
| strong design patterns | |
| Networking (General Purpose) | |
| general purpose protocol (TCP) | nothing comparable |
| plugins can send data without caring about protocol basics | |
| integrated host and service discovery | |
| efficient, small, simple | |
| Config Subsystem | |
| SQLite database | XML file |
| automatic notification of changes | process needs restart after each change |
| remote editing in real-time | file must be edited directly |
| Geometry Library | |
| custom small geometry library | CGAL |
| developed with robotics in mind | general-purpose heavyweight library |
| Utils | |
| well ordered | grown over the years |
| only stuff that is used today | old code lingering around |
| clear inter-lib hierarchy | unclear inclusion hierarchies |
| Time & Data | |
| centralized clock | each process uses system clock |
| simulation time can be plugged in easily | tough to get simulated time distributed |
| both: data between modules is transmitted via the BlackBoard and can be read and logged in between | |
| notification of data changes as-they-happen | no notification available |
| centralized world model, all mid- and high-level components use the very same data | world model not synchronized between processes, different modules may work on different data |
| each robot integrates the readings of local sensors and other robots' sensors | data of all robots is merged on a central world model PC and then sent around |
| latency of < 50 ms expected | high latency, measurements showed > 200 ms |
| Synchronization | |
| one central main loop to synchronize plugins | no synchronization, each module runs on its own |
| each datum processed only once on higher levels | data is likely to be processed multiple times over the full runtime |
| well-defined hook for threads | threads run "at-will" |
| Modules/Plugins | |
| only a few plugins so far | many proven-to-somehow-work modules |
| some critical functionality still missing | it worked and works for @Home |
| strong common design patterns (aspects) | loosely following common patterns |
| Build System | |
| small and fast | multiple times the code, long unnecessary dependency generation step |
| easy dependency checking with graceful skipping | no compile-time dependency checks |
| prepared for parallel building | not prepared |
| easily extensible | fixed maximum number of targets |
| prepared for inclusion in other projects | hard-coded for RCSoftX |
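
The BlackBoard rows in the table contrast a read/write lock per interface and a single memcpy of a small memory block with a global mutex and per-value string lookups. The following is a minimal sketch of the first approach, not the actual Fawkes API: the names `PoseData` and `PoseInterface` are hypothetical, and a C++17 `std::shared_mutex` is assumed as the read/write lock. Each interface owns its own lock, so readers of one interface never block users of another, and a read or write is one block copy regardless of how many fields the interface has.

```cpp
#include <cassert>
#include <cstring>
#include <mutex>
#include <shared_mutex>

// Hypothetical fixed-layout payload. In the design described above such a
// struct would come from the interface generator rather than be hand-written.
struct PoseData {
  float x;
  float y;
  float ori;
};

// Hypothetical shared storage for one interface, guarded by its own
// read/write lock (one lock per interface, not one global mutex).
class PoseInterface {
 public:
  // Writer side: copy the caller's private block into the shared storage.
  void write(const PoseData &local) {
    std::unique_lock<std::shared_mutex> lock(rwlock_);  // exclusive access
    std::memcpy(&shared_, &local, sizeof(PoseData));
  }

  // Reader side: copy the shared storage into the caller's private block.
  void read(PoseData &local) const {
    std::shared_lock<std::shared_mutex> lock(rwlock_);  // readers may overlap
    std::memcpy(&local, &shared_, sizeof(PoseData));
  }

 private:
  mutable std::shared_mutex rwlock_;
  PoseData shared_{};
};
```

A plugin would fill a local `PoseData`, call `write()` once per loop, and readers would call `read()` and then work on their private copy without holding the lock. The "at most one writer per interface" rule is not enforced in this sketch; it would have to be checked when an interface is opened for writing.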

