5/08/2012

Ad-hoc camera sharing from Android to multiple parallel viewers on Android and Desktop -- part 2 of 2



As part of our ongoing "Ad-hoc Mobile Ecosystem" project we conducted research on the “Ad-hoc camera sharing” use-case.

Due to our user centric concept, where we try to minimize technical context switches, we decided to integrate ad-hoc real time camera sharing from other mobile users into the Augmented Reality view as the next building block.

This is part two of two.

Results of the Review and Evaluation steps

(1) SipDroid based steps:

  • First iteration of our GitHub RTSP-Camera project
  • running with the H263-1998 packetizer from SipDroid
  • newly implemented feature: sending the same payload to a list of RTP/UDP clients in parallel (see the sketch below)
  • First implementation of our "Android embedded RTSP Video Server" (part of the GitHub RTSP-Camera project)
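
The fan-out idea behind that feature can be sketched roughly as follows. This is a minimal illustration in plain Java, not the actual GitHub code; class and method names are our own, and RTP packet building itself (header, sequence numbers, timestamps) is assumed to happen in the packetizer that calls send().

    import java.io.IOException;
    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetSocketAddress;
    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;

    // Minimal sketch: one packetizer output, fanned out to all subscribed RTP/UDP clients.
    public class MultiClientRtpSender {

        private final DatagramSocket socket;
        // Clients can subscribe and unsubscribe while streaming, hence a concurrent list.
        private final List<InetSocketAddress> clients = new CopyOnWriteArrayList<InetSocketAddress>();

        public MultiClientRtpSender() throws IOException {
            this.socket = new DatagramSocket();
        }

        public void addClient(InetSocketAddress client) {
            clients.add(client);
        }

        public void removeClient(InetSocketAddress client) {
            clients.remove(client);
        }

        // Called by the packetizer for every finished RTP packet.
        public void send(byte[] rtpPacket, int length) {
            for (InetSocketAddress client : clients) {
                try {
                    socket.send(new DatagramPacket(rtpPacket, length, client));
                } catch (IOException e) {
                    // Drop the packet for this client only; RTP over UDP is lossy by design.
                }
            }
        }
    }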

Result:

  • VLC is able to show the video stream with an estimated delay of 1.5 seconds
  • The Android Media API “VideoView” buffers the incoming video stream for 10-12 seconds before playback starts! (see the snippet below for how VideoView was driven)
  • There are no configuration options or tricks to circumvent this!
This delay is not acceptable for our use-case.
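
For reference, the consumer side in this test is nothing more than the standard VideoView pointed at an rtsp:// URL. A minimal snippet, assuming an Activity whose layout contains a VideoView (URL and view id are placeholders):

    // Placeholder URL of the phone running RTSP-Camera.
    final VideoView videoView = (VideoView) findViewById(R.id.video_view);
    videoView.setVideoURI(Uri.parse("rtsp://192.168.1.23:8086/camera"));
    videoView.setOnPreparedListener(new MediaPlayer.OnPreparedListener() {
        public void onPrepared(MediaPlayer mp) {
            // Fires only after VideoView has finished its internal buffering,
            // which is where the 10-12 second delay shows up.
            videoView.start();
        }
    });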

(2) SpyDroid based steps:

  • Next step based on H264 encoding to evaluate the VideoView delay
  • Integrated the H264 packetizer from SpyDroid
  • Adapted the H264 packetizer to our multi-client RTSP Video Server RTP/UDP sender pattern

Results:

  • VideoView does not support H264 decoding for RTSP streams
  • VLC needs a detailed “sprop-parameter-sets” attribute in the SDP file to define the SPS (Sequence Parameter Set) & PPS (Picture Parameter Set) according to level / profile / width / height / framerate (see the SDP sketch below).
  • Running VLC with debugging enabled is essential to understand the different errors
    • VideoLAN\VLC\vlc.exe --extraintf=http:logger --verbose=6 --file-logging --logfile=vlc-log.txt
  • For example, a missing or wrong “sprop-parameter-sets” in SDP ends up with continuous:
    • [112aded0] packetizer_h264 packetizer warning: waiting for SPS/PPS
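
For illustration, this is roughly how an RTSP server can assemble the SDP body of its DESCRIBE response so that VLC gets past that warning. The sketch is our own simplification; the Base64 SPS/PPS strings and the profile-level-id are placeholders and must be derived from the real encoder output.

    // Sketch of the SDP body returned on RTSP DESCRIBE for an H264 stream.
    public static String buildSdp(String serverIp, int rtpPort,
                                  String spsBase64, String ppsBase64, String profileLevelId) {
        return "v=0\r\n"
             + "o=- 0 0 IN IP4 " + serverIp + "\r\n"
             + "s=RTSP-Camera\r\n"
             + "c=IN IP4 " + serverIp + "\r\n"
             + "t=0 0\r\n"
             + "m=video " + rtpPort + " RTP/AVP 96\r\n"
             + "a=rtpmap:96 H264/90000\r\n"
             + "a=fmtp:96 packetization-mode=1;profile-level-id=" + profileLevelId
             + ";sprop-parameter-sets=" + spsBase64 + "," + ppsBase64 + "\r\n";
    }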

(3) IP-Webcam based steps:

  • Next step based on MJPEG encoding and streaming through an HTTP server
  • GitHub HTTP-Camera project

Results:

  • Straightforward implementation as an alternative to the RTSP stream based approach, but with shortcomings regarding bandwidth and client support
  • The HTML <img> tag in modern desktop browsers (Firefox, Chrome, …) does support streamed MJPEG rendering.
  • The HTML <img> tag in the Android WebView doesn’t support streamed MJPEG rendering!

(4) IMSDroid based steps:

  • Test a streaming session between two Android phones with a locally installed OpenSIP server
    • Result: delay is under one second

Results:

  • Held back as a fallback in case the Orange code base fails

(5a) Orange RCS/IMS H263 steps:

We preferred the Orange Labs native encoding project over IMSDroid due to its maturity level, code structure and active maintenance.
  • Extracted the H263-2000 native encoder for RTSP-Camera
  • Adapted it to our multi-client RTSP Video Server RTP/UDP sender pattern
  • Extracted the H263-2000 native decoder for RTSP-Viewer

Results:

  • Smooth streaming in both VLC and our Android RTSP-Viewer app

(5b) Orange RCS/IMS H264 steps:

  • Extracted the H264 native encoder for RTSP-Camera
  • Adapted it to our multi-client RTSP Video Server RTP/UDP sender pattern
  • Extracted the H264 native decoder for RTSP-Viewer

Results:

The main difference between H264 and H263 is the need for SPS/PPS information. There are two options for sending these parameters:
  • out of band, which means either encoded in the SDP “sprop-parameter-sets” attribute or at the very beginning of the stream
  • in band, which means with each key-frame (IDR)
The native H264 decoder from PV OpenCORE does not support SDP files, at least not with the JNI interface provided by Orange. It relies instead on SPS/PPS embedded as NAL units within the stream.

Switching the default “out of band” parameter handling to “in band” was the result of a very deep debugging and learning session. Our use-case has to support a “mid-stream” decoder start, which means that a client wants to join a running stream. This is different from the use-case supported by Orange: an initiated session between two partners who share their view. The latter works with a one-time SPS/PPS package (out of band); our use-case failed with it.

Interestingly enough, the underlying PV OpenCORE code supports a parameter to switch exactly that handling, called “out_of_band_param_set”.

The obvious Internet research upfront ended with NULL source code snippets, i.e. not a single example or project showing the right usage of “in band” parameters.

Changing the logic on our own, namely
  1. encoding with continuous SPS/PPS NAL unit injection before each IDR NAL unit (see the sketch below) and
  2. bringing decoding with in band SPS/PPS to life,
was two days of work.
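
To make point 1 concrete, the sending side can be sketched as below. This is our own simplified illustration, modelling the encoder output as an Annex B elementary stream that is handed on to the packetizer; in the real projects this happens inside the native code.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;

    // "In band" parameter set handling on the sending side: cache SPS/PPS once and
    // re-inject them before every IDR NAL unit, so that a viewer joining mid-stream
    // can initialize its decoder at the next key-frame.
    public class InBandParameterSetInjector {

        private static final byte[] START_CODE = {0x00, 0x00, 0x00, 0x01};

        private final byte[] sps;   // NAL type 7, without start code
        private final byte[] pps;   // NAL type 8, without start code

        public InBandParameterSetInjector(byte[] sps, byte[] pps) {
            this.sps = sps;
            this.pps = pps;
        }

        public byte[] wrap(byte[] nalUnit) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int nalType = nalUnit[0] & 0x1F;    // lower 5 bits of the NAL header
            if (nalType == 5) {                 // 5 = IDR slice (key-frame)
                out.write(START_CODE);
                out.write(sps);
                out.write(START_CODE);
                out.write(pps);
            }
            out.write(START_CODE);
            out.write(nalUnit);
            return out.toByteArray();
        }
    }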

In the end we found the right encoder logic to build the correct order of NAL units, and we found a bug deep down in the C++ decoding sources. This convinced us that this was the reason for the "NULL code snippets" research result: we are the first to bring this use-case to life based on this library.

At the same time it became clear that the Android Media APIs, based on the same Stagefright / OpenCORE libraries, are not able to support mid-stream decoding.

In our case, VLC and the Android RTSP-Viewer are now able to start rendering an H264 encoded stream mid-stream.

Summary

We learned the pros & cons of all the different approaches through this project. Based on the lessons learned we are now able to estimate risk and effort for further steps accurately. Even more, we are now able to discuss architectures and evaluate other solutions.

The confusion that still surrounds mobile video-conferencing discussions on the net is mostly cleared up for us now.

A critical element of our development approach was the very challenging low level debugging which was necessary for this project. For example, none of the extracted encoder solutions or TCP/UDP based communications worked instantly. Every evaluation and coding step needed serious debugging and a solid understanding of the communication definitions and the protocol stack.

"Ad-hoc Expertise" has been proven again as a vital part of our core competencies.

We solved every single encoder / decoder extraction and adaptation to our RTSP Video Server multi-client framework. But to get there we had to learn the following new technologies "on the fly":
  • RFCs (RTSP, H263/H264 over RTP, H263 and H264 de/encoding, ...),
  • TCP/UDP/RTP/H26x protocol analysis through Wireshark,
  • Native C++ libraries debugging based on Android NDK

Our working proof-of-concept code skeleton for the “Ad-hoc video sharing” use-case (with parts of the intermediate steps, like Android Media API based versus native JNI encoding/decoding) is provided for the benefit of all under GPLv3 on GitHub:

References

Ad-hoc camera sharing from Android to multiple parallel viewers on Android and Desktop -- part 1 of 2

As part of our ongoing "Ad-hoc Mobile Ecosystem" project we conducted research on the “Ad-hoc camera sharing” use-case.

Due to our user centric concept, where we try to minimize technical context switches, we decided to integrate ad-hoc real time camera sharing from other mobile users into the Augmented Reality view as the next building block.

This is part one of two.

The camera sharing should be available on demand to multiple viewers in parallel, which leads us to a new "Android Embedded Video Server" capability based on the Real Time Streaming Protocol (RTSP) with streaming over the Real-time Transport Protocol (RTP).

We have to make a clear differentiation between existing "Video Conference" solutions and our new "Embedded Video Server" architecture, to clarify why the "Embedded Video Server" is the one we decided to build our solution on, given our context of an "Ad-hoc Mobile Ecosystem".

Video Conference Architecture

  • Centralized video server which 
    • receives multiple participant camera streams, 
    • merges all streams into one video conference view 
    • and delivers that merged video back to all participants.
  • "Call" Sharing pattern: 
    • Additional conference partners need to be actively "Called" into the conference by the conference owner.
  • A typical implementation of that "Call" is the Session Initiation Protocol (SIP), which is an active coupling of two or more conference participants.
  • A typical implementation for stream delivery is the Real-time Transport Protocol (RTP).

Embedded Video Server Architecture

  • Decentralized video servers, embedded within each Android mobile as an App
  • "Pub/Sub" Sharing pattern : 
    • Decoupled sender/client relationship. 
    • A new client can subscribe to all published streams on demand, without interfering with other participants in that stream.
  • A typical provider for the Pub/Sub infrastructure is XMPP.
  • The implementation for a stream subscription is the Real Time Streaming Protocol (RTSP)
  • The implementation for stream delivery is the Real-time Transport Protocol (RTP).

Because we could not find a ready-to-use Android feature nor an Open Source project ready to use, only a lot of uncertainty, open questions and partial successes documented over the last years, we started to investigate existing solutions, to compose a solution and to implement the missing links on our own.

We don’t want to point you straight to our solution on GitHub, but instead document our approach and the milestones towards our solution.

We want to demonstrate that without existing Open Source projects we wouldn't have a solution at all. But even with all the resources around, we still need a very high level of knowledge and understanding of all the related technologies to fill in the missing links.

Nobody has all that knowledge readily available in a small team of two. But an evaluation of the following list of enterprise level Open Source projects, each based on a stack of protocols and standards, demands such skills.

Building up “Ad-hoc Expertise” in formerly unknown topics (technical, standards, architecture, ...) is an essential part of our core competency and makes us a perfect partner for your demanding innovative projects.

Use-case summary

Ad-hoc camera sharing from Android (provider) 
to multiple parallel on-demand viewers (consumers)
on Android Apps and PCs
based on a decoupled Publish & Subscribe pattern 
between provider and consumers

Requirements

Quality

  • QCIF (176 x 144) and CIF (352 x 288) resolution
  • minimal delay for video playback (near realtime streaming)
  • support for WLAN bandwidth

Open Standards

  • H263 / H264 encoder and decoder for video streams
  • RTSP server for multiple parallel on-demand streaming support (http://www.ietf.org/rfc/rfc2326.txt)
  • Support for mid-stream decoder start (technical background: continuous sending of the in band parameters SPS/PPS)
  • H263 and H264 over RTP/UDP
    • RTP Payload Format for H.263 Video Streams (http://www.ietf.org/rfc/rfc2190.txt)
    • RTP Payload Format for H.264 Video Streams (http://www.ietf.org/rfc/rfc3984.txt)
  • MJPEG over HTTP

Reuse of Open Source to minimize development time and risk

Android embedded Media API

  • Camera
  • MediaPlayer
  • MediaRecorder
  • VideoView
  • WebView

Desktop / Browser embedded media capabilities

  • HTML <img> tag, capable of consuming and displaying MJPEG
  • HTML5 <video> tag, capable of consuming and displaying RTSP streaming
  • VLC Media Player (http://www.videolan.org/), which is Open Source itself and built with strong support for video/audio codecs. Important for our scenario: it is capable of consuming and displaying RTSP streams with different encodings.

Open Source review candidates from

  • Video conferencing / telephony
  • Web Cameras
  • PC and Android browser support for the <img> and <video> tags

Research and Evaluation overview

Our main goal was to compose two functional apps: an Android RTSP-Camera and an Android RTSP-Viewer. Only essential code should be extracted from other projects to provide a working skeleton for further development. The following chronological list of projects shows the most important milestones of our research, evaluation and composition of our own solution.

(0) RTSP Server / Client

  • Real Time Streaming Protocol, or RTSP, is an application-level protocol for control over the delivery of data with real-time properties (http://www.ietf.org/rfc/rfc2326.txt); a minimal client-side handshake sketch follows below
  • RTSPClientLib (http://code.google.com/p/rtsplib-java/)
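
To recap what a stream subscription looks like on the wire, here is a minimal client-side handshake sketch against an RTSP server. Host, port, stream URL and client ports are placeholders, and the RTP reception itself is not shown.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.net.Socket;

    // Minimal sketch of the RTSP (RFC 2326) subscription handshake a viewer performs.
    public class RtspSubscribeSketch {

        public static void main(String[] args) throws IOException {
            String url = "rtsp://192.168.1.23:8086/camera";   // placeholder stream URL
            Socket socket = new Socket("192.168.1.23", 8086);
            Writer out = new OutputStreamWriter(socket.getOutputStream(), "US-ASCII");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), "US-ASCII"));

            // 1. DESCRIBE: fetch the SDP announcing codec, payload type and sprop-parameter-sets.
            request(out, in, "DESCRIBE " + url + " RTSP/1.0\r\nCSeq: 1\r\nAccept: application/sdp\r\n\r\n");

            // 2. SETUP: announce the local RTP/RTCP ports; a real client would use the
            //    per-track control URL taken from the SDP instead of the base URL.
            String session = request(out, in, "SETUP " + url + " RTSP/1.0\r\nCSeq: 2\r\n"
                    + "Transport: RTP/AVP;unicast;client_port=5000-5001\r\n\r\n");

            // 3. PLAY: the server starts sending RTP packets to the announced client_port.
            request(out, in, "PLAY " + url + " RTSP/1.0\r\nCSeq: 3\r\nSession: " + session + "\r\n\r\n");

            socket.close();
        }

        // Sends one request, prints the response headers, skips an optional body
        // and returns the Session id if the server sent one.
        private static String request(Writer out, BufferedReader in, String rtspRequest) throws IOException {
            out.write(rtspRequest);
            out.flush();
            String session = "";
            int contentLength = 0;
            String line;
            while ((line = in.readLine()) != null && line.length() > 0) {
                System.out.println(line);
                String lower = line.toLowerCase();
                if (lower.startsWith("session:")) {
                    session = line.substring(8).split(";")[0].trim();
                } else if (lower.startsWith("content-length:")) {
                    contentLength = Integer.parseInt(line.substring(15).trim());
                }
            }
            for (int i = 0; i < contentLength; i++) {
                in.read();   // skip the body (e.g. the SDP answer of a DESCRIBE)
            }
            return session;
        }
    }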

(1) SipDroid (http://code.google.com/p/sipdroid/)

  • Video encoding done through the Android Media API “MediaRecorder” (3GPP / H263-1998); see the capture sketch below
  • H263-1998 packetizer for RTP
  • Video decoding done through the Android Media API “VideoView”, started with an rtsp:// URL
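
The capture path behind the first bullet can be sketched roughly like this. Feeding MediaRecorder a socket file descriptor instead of an output file is the trick these projects rely on; class names, sizes and the packetizer wiring are our own simplifications.

    import java.io.IOException;

    import android.media.MediaRecorder;
    import android.net.LocalServerSocket;
    import android.net.LocalSocket;
    import android.net.LocalSocketAddress;
    import android.view.Surface;

    // Rough sketch of the MediaRecorder based capture path: the recorder writes its
    // 3GPP/H263 output into a local socket pair and the packetizer thread reads the
    // raw stream from the other end.
    public class MediaRecorderCapture {

        private LocalServerSocket serverSocket;
        private LocalSocket receiver;   // the packetizer reads from this end
        private LocalSocket sender;     // MediaRecorder writes to this end
        private MediaRecorder recorder;

        public LocalSocket start(Surface previewSurface) throws IOException {
            // Local socket pair replacing the usual output file.
            serverSocket = new LocalServerSocket("rtsp-camera-capture");
            receiver = new LocalSocket();
            receiver.connect(new LocalSocketAddress("rtsp-camera-capture"));
            sender = serverSocket.accept();

            recorder = new MediaRecorder();
            recorder.setVideoSource(MediaRecorder.VideoSource.CAMERA);
            recorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
            recorder.setVideoEncoder(MediaRecorder.VideoEncoder.H263);
            recorder.setVideoSize(176, 144);    // QCIF
            recorder.setVideoFrameRate(15);
            recorder.setPreviewDisplay(previewSurface);
            recorder.setOutputFile(sender.getFileDescriptor());
            recorder.prepare();
            recorder.start();

            // The packetizer strips the 3GPP container from this stream and builds RTP packets.
            return receiver;
        }

        public void stop() throws IOException {
            recorder.stop();
            recorder.release();
            receiver.close();
            sender.close();
            serverSocket.close();
        }
    }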

(2) SpyDroid (http://code.google.com/p/spydroid-ipcamera/)

  • Video encoding done through Android Media API: “MediaRecorder” (3GPP / H264)
  • H264 packetizer for RTP
  • RTP wrapped up with Flash Video (FLV)
  • Video decoding done through Flash Player

(3) IP-Webcam (https://play.google.com/store/apps/details?id=com.pas.webcam)

  • Closed source, but the main idea is to send an HTTP based “multipart/x-mixed-replace” stream of full frame JPEGs, which can be captured with “onPreviewFrame()” from the Android Camera API
  • with the help of some additional resources, an HTTP server sending the right protocol was a simple task (http://www.damonkohler.com/2010/10/mjpeg-streaming-protocol.html); see the sketch below
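
A minimal sketch of such a server follows, assuming a frame source that delivers complete JPEG frames (for example compressed out of onPreviewFrame()); port, boundary string and per-client threading are simplified.

    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Minimal sketch of the MJPEG-over-HTTP idea: one multipart/x-mixed-replace
    // response per client, each part carrying a complete JPEG frame.
    public class MjpegHttpServerSketch {

        private static final String BOUNDARY = "mjpegframe";

        public interface FrameSource {
            byte[] nextJpegFrame();   // e.g. compressed from Camera.onPreviewFrame()
        }

        public static void serve(int port, FrameSource source) throws IOException {
            ServerSocket server = new ServerSocket(port);
            while (true) {
                Socket client = server.accept();
                // One streaming loop per client; a real implementation serves clients in threads.
                OutputStream out = client.getOutputStream();
                out.write(("HTTP/1.0 200 OK\r\n"
                        + "Content-Type: multipart/x-mixed-replace;boundary=" + BOUNDARY + "\r\n"
                        + "Connection: close\r\n\r\n").getBytes("US-ASCII"));
                try {
                    while (true) {
                        byte[] jpeg = source.nextJpegFrame();
                        out.write(("--" + BOUNDARY + "\r\n"
                                + "Content-Type: image/jpeg\r\n"
                                + "Content-Length: " + jpeg.length + "\r\n\r\n").getBytes("US-ASCII"));
                        out.write(jpeg);
                        out.write("\r\n".getBytes("US-ASCII"));
                        out.flush();
                    }
                } catch (IOException clientGone) {
                    client.close();   // the browser disconnected; wait for the next client
                }
            }
        }
    }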

(4) IMSDroid (http://code.google.com/p/imsdroid/)

  • Native encoder and decoder provided by a native JNI sub project
  • An IMS (IP Multimedia Subsystem) proof-of-concept spin-off from Doubango project (http://www.doubango.org/)
  • Native packetizer for RTP
  • Native renderer to Android Media API “VideoSurface”

(5) android-rcs-ims-stack (http://code.google.com/p/android-rcs-ims-stack/)

  • Part of RCS-e initiative (http://en.wikipedia.org/wiki/Rich_Communication_Suite#RCS-e)
  • provided and maintained by French Telco “Orange Labs”
  • Native encoder and decoder provided by a native JNI sub project
  • Native renderer to Android Media API “VideoSurface”
  • The JNI sub project is directly based on the Android internal implementation (PacketVideo OpenCORE, Stagefright), which is hidden from the Java APIs. Orange made a small JNI wrapper for their very own use-cases.
Read on with part two in the next blog entry.

4/13/2012

User Centric Augmented Reality

Our MOBILE USER is challenged day to day, minute to minute on his path through a set of overlapping spheres. Totally different from his focused work in front of a desktop PC, his focus is shifting and changing with his movement. His different goals, tasks and attentions change priority with his context of activity, environment or culture.

So the main objective for mobile apps is:

Minimize technical context switches in an environment of continuously changing user contexts

User Centric Mobile Apps is our concept and the Ad-hoc Mobile Ecosystem is our implementation.

Our next step is the seamless integration of a Tactical (2D) map-based view and a First Person View (3D) into a single User Centric Augmented Reality app with the following requirements:
  • Minimize distraction through context switches with a unified AR view
  • Deeply situated navigation through decision relevant information
  • Natural support for concurrent events (calls, chat, navigation, search)
We do not limit AR to the visual sense, but include the tactile sense (vibration) and hearing as well.

Evaluation of our AR project list was very helpful in more than one direction. Digging through code, wikis and discussions gave us a realistic impression of what is possible today (mature / quick win)... and what is on the horizon (incubator) and worth monitoring.

First Person View (3D)

A set of Open Source solutions for Augmented Reality has been evaluated and tested. As a result, one of the most mature frameworks to support the above mentioned project plan is DroidAR.
  • It is set up as a library to build upon, not a project tied to one use-case.
  • It contains a rich set of working demos to experience the library versatility.
  • Screencast series on YouTube.
  • Support for visualizing OpenGL 3D models, Pattern Recognition and building 3D Head Up Displays (HUD).

Tactical View (2D)

One of the main features of the Ad-hoc Mobile Ecosystem project is to provide and share location-based information. Maps are made available through an Open Source WMS-Server.

gvSIG Mini is an appropriate WMS-Client for Android platforms and we worked with it successfully within the German Bundeswehr funded APP-6 Maker project (more on EC Joinup Portal).

Breaking news

The 4th of April gave us an important kick with the next step of the formerly *secret* Google[x] Labs smart glasses project, now called Project Glass. Read more and follow them on their G+ page. Besides all of our concepts and software, this hardware project seems to be the ideal fit for our vision.

3/30/2012

App Development is possible Anytime & Anywhere


Android Java IDE AIDE

This free Android app is a nugget of innovation and a true representative of the Mobility paradigm shift.

I took it for a test ride and it is indeed a fully functional Java IDE on your Android device, with an amazing developer experience previously known only from heavyweight desktop IDEs like Eclipse.

Imagine the option to fix or enhance your Apps in the field.

Do-it-yourself (DIY) with state of the art Android tablets running quad-cores with up to 18h of battery operation (e.g. Asus Transformer Prime + keyboard dock).

Where is the limit? What is next?

Augmented Reality

Augmented Reality (AR) is one of the building blocks of our Ad-hoc Mobile Ecosystem.
Find below the list of Open Source candidates from Google Code that we currently have under evaluation.

Location based

AR Browser & Multimedia Interactive Tagging (Audio, Graphic, Text)

  1. armp: Location based audio streaming
  2. mezzofanti: AR through text-recognition and translation
  3. realgraffiti: Location based graphic marking (tagging)
  4. staaar: Location based text tagging and sharing

AR Browser

  1. mixare: Location based AR
  2. raveneye: Location based AR, support for POI and way-points
  3. android-argame: A Mixare fork which tries to support AR multi-user scenarios

Marker Based

  1. droidar: Basic functionality for marker based and location based AR
  2. nfc-contextual-learning: Near Field Communication (NFC) Marker based AR

AR Libraries

  1. armd2011: AR Game lectures at Vienna University of Technology, showcases the usage of central AR relevant Android libraries.
  2. android-augment-reality-framework: A framework for creating marker based augmented reality apps on Android
  3. android-ar-base: Basic functionality for marker based AR (based on NyARTookit). 
  4. aruco-android: Ongoing Spanish project to establish an AR library

Agility and Speed are key factors

The Ad-hoc Mobile Ecosystem is specifically designed to support disaster relief scenarios and military operations and thus it is mission relevant software.

"We cannot retreat behind a Maginot Line of firewalls otherwise we will risk to being overrun. It is like maneuver warfare, all that matters is speed and agility."(Referring to a quote from William J. Lynn III [1])

What has changed since 1940 in Maginot?
  • Digital information has become a central asset in military conflicts.
  • This way software has become a critical resource for military operations.
  • There is an outside world, not necessarily combatants, which has learned how to utilize digital information at tremendous speed, potentially in a conflict situation against us.

Speed and cost are the most critical metrics referenced for development tasks today.
With budget cuts proposed for nearly all western defense communities, cost is the most urgent parameter.

But from an operational view, speed is named most often. We have to look at speed from more than one dimension:
  • Inner dimension: Speed of continuous change of requirements during development and way more important during life-cycle after deployment of the solution.
  • Outer dimension: High speed evolution of software outside the military realm and the need "not being overrun".
A compilation of commonly cited requirements, in no particular order of priority:
  • Increased Agility and Flexibility
  • Faster delivery
  • Increased Innovation
  • Reduced Risk
  • Increased Information Assurance & Security
  • Lower Costs
   
The next blog entry will focus on how to respond to these requirements with Open Source Software (OSS).

Attribution

[1] Lynn, William J. III. Sep 2010. "Defending a New Domain: The Pentagon's Cyberstrategy". Foreign Affairs.

Mobile Computing is mission critical

The last decades have seen several major paradigm shifts in computing: personal computers, graphical user interfaces, the internet, the world wide web, laptops and wireless networking.

Each of these has fundamentally changed the way of personal computing. Mobile computing, with its new concepts of information processing such as "anytime & anywhere" or "context-awareness", is a paradigm shift as big as any of the previous ones.

The military sector is facing the challenge of how to participate in this technological revolution. A few armed forces, such as those of the US, have already begun to respond to this challenge.

Today's military conflicts are mostly asymmetrical conflicts in which armed forces meet opponents with low firepower on the one hand, but which are highly mobile and equipped with easily available and powerful communication and information technology on the other hand.
 

Currently, the answer to the rapid pace of development in mobile communications and information technology is still a long-standing and inflexible procurement process. The result is that communications and information technology deployed to the troops is outdated by the time of its release, and increasingly falls behind the technical capabilities of potential adversaries.
 
So, to meet the challenges of future missions, a continuous deployment of up-to-date communications and information technology must be assured for armed forces such as the German Bundeswehr.