\documentclass[11pt]{article}
\usepackage{iitbtitle}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{a4wide}
\usepackage{multirow}
\usepackage{anysize}
\usepackage{iitbcs}
\usepackage{latexsym}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{graphicx}
\usepackage{epsfig}
\usepackage{comment}
% \usepackage{psfig}
\usepackage{tabls}
\usepackage{multirow}
\usepackage{tabularx}
\usepackage{url}
\newcolumntype{Y}{>{\centering\arraybackslash}X}
\setlength{\floatsep}{7pt plus 2pt minus 2pt}
\setlength{\textfloatsep}{5pt plus 2pt minus 2pt}
\setlength{\intextsep}{5pt plus 2pt minus 2pt}
\usepackage{color}
\marginsize{2.5cm}{2.5cm}{1cm}{1.5cm}
\def\baselinestretch{1.15}
\newcommand\INPUT{\item[\textbf{Input:}]}
\newcommand\OUTPUT{\item[\textbf{Output:}]}
\providecommand{\norm}[1]{\lVert#1\rVert}
\DeclareGraphicsExtensions{.pdf,.png,.jpg}
\begin{document}
%\baselineskip 20pt
%The definitions
\def\title{Quadcopter Based Applications in Imaging}
\def\what{Third Annual Progress Report}
\def\degree{Doctor of Philosophy}
\def\who{Meghshyam G. Prasad}
\def\roll{124058001}
\def\guide{Prof. Sharat Chandran \\ Prof. Michael Brown}
\titlpage
\def\bsq{\begin{flushright} $\blacksquare$\\ \end{flushright}}
\def\tab{\hspace{5mm}}
\newpage
\section*{Acknowledgements}
I express deep and sincere gratitude to Prof. Sharat Chandran and Prof.
Michael Brown for providing constant direction and guidance. I am also
very grateful to Prof. David Hsu for giving me access to his lab and
equipment. I take this opportunity to thank Mr. Anirudh Vemula for his
help in the mosaicing project; I am indebted to him for his help in data
collection for that project. I am also thankful to Mr. Abhishek Chakrborty and
Mr. Ravi Chalasani for their help in the stagnant water detection project.
\begin{flushright}
Meghshyam G. Prasad
\end{flushright}
\pagenumbering{roman}
\newpage
\begin{abstract}
In today's world, digital imaging is extensively used in many fields. In some
situations, it is quite difficult to take pictures with a handheld camera. Low
cost quadcopters such as Parrot's AR Drone may be used in such scenarios to
take photos of an object that is otherwise out of reach of a normal camera.
There are two problems involved: first, tracking the object, and second,
taking ``good'' pictures of it. In the last APS we discussed
the first problem. In this report, we show how to take good pictures and use
those pictures for various applications.
One scenario where we may use a quadcopter for imaging is to
capture a panorama of a large wall (or any such planar object). In such cases, it
is very tedious and tiring to use a handheld camera. Secondly, if there
are regions with very few ``features'' on a wall, it is challenging
for existing mosaicing techniques to construct a panorama. In our work, we focus
on methods to construct panoramas from quadcopter-captured scenes with
significant regions of vacant space. We demonstrate the efficacy of our
approach on a number of input sequences that cannot be mosaiced by existing methods.
Though using a quadcopter for mosaicing is a challenging application, it has
limited social use. Hence, we wanted to use the quadcopter for an imaging
application that may serve a social cause. In recent times, there has
been a sharp increase in dengue and malaria, especially in urban areas. One of
the major reasons for this health hazard is the number of locations where one
can find stagnant water. These locations are large breeding grounds for fast
multiplying mosquitoes and other insects. Such areas include traditionally
uncovered gutters, terraces of high-rise buildings, and shades above
windows (popularly known as ``chhajja'') -- areas that are hard to reach and
access. We propose the use of a quadcopter to inspect such areas and identify
stagnant water patches using a novel technique based on optical flow.
\noindent \textbf{Keywords:} Quadcopter, Panoramic Image Stitching, Stagnant Water Detection
\end{abstract}
\newpage
\pagenumbering{arabic}
\tableofcontents
%\begingroup\def\thispagestyle#1{}\def\baselinestretch{1.5}\tableofcontents\endgroup
\newpage
\section{Motivation}
In the last APS, we presented the design and detection algorithm of a blur
resilient fiducial for the quadcopter. Such a fiducial is required for tracking objects with
a quadcopter. After finding a possible solution to the tracking problem, we
used the quadcopter for actual imaging and developed applications based on it.
The first application that we developed is ``Panoramic Image Stitching''
through a quadcopter. It is very tedious to capture a panorama of a long wall such as the one
shown in Figure \ref{fig:wall}. So, we used the quadcopter to ``cover'' such planar
objects. Then, using our image selection algorithm, we selected ``appropriate'' images
and stitched them to obtain the final mosaic. Panoramic image stitching
becomes even more problematic when there are vacant spaces in the captured
scene, as the lack of features hinders homography-based warping. We solved this
problem using approximate positional information acquired through the quadcopter.
\begin{figure}[ht!]
\centering
\includegraphics[width=\linewidth]{IMG_20150415_155624.jpg}
\caption{A sample long wall.}
\label{fig:wall}
\end{figure}
The second application that we developed is stagnant water identification
using a quadcopter. We found many locations in our institute that do not attract
much attention but where water easily accumulates (as shown in
Figure \ref{fig:puddle}). Such puddles are perfect locations for mosquitoes to
breed and spread deadly diseases like dengue. These puddles are formed mainly in
places such as terraces, window shades, etc. Surveying such places manually is
very difficult. So, we used the quadcopter to ``survey'' the campus and detect
places where water stagnates.
\begin{figure}[ht!]
\centering
\includegraphics[width=\linewidth]{IMG_20150302_160349385_HDR.jpg}
\caption{Sample water puddle in our institute.}
\label{fig:puddle}
\end{figure}
In summary, we have explored technical problems in the domain of computer
vision for novel applications using a quadcopter.
\section{Mosaicing Scenes with Vacant Spaces}
Finding features and using them to align images to construct wide field
of view panoramas is one of the success stories of
computer vision. Virtually all recent consumer cameras have this
technology embedded. The success of these methods relies significantly on
finding common features in the images that can be used to establish the
appropriate warps to register the images together.
There are scenes, however, whose image content makes this
challenging. One situation is when scene patterns and texture are repeated;
this can make it hard for the matching algorithm to find appropriate matches.
A more serious situation is when a scene area simply
does not contain features. This sort of situation occurs in a variety
of scenarios, such as paintings in an art exhibition or posters in an
auditorium.
{\bf Key Idea} We propose to solve the vacant space problem by using
an inexpensive off-the-shelf flying device, such as a quadcopter, which
can be assumed to contain an inertial measurement unit (IMU) whose
positional information is time-synchronized with the input video. The
proximity relationship among the resulting images can be used to
significantly reduce the search space in finding matches. Further,
the proximity relationship also allows us, in principle, to vary the
parameters involved in feature selection. For example, if there is
reason to believe that two images are adjacent horizontally, one can
choose to adjust thresholds in the feature matching algorithm to hunt for
otherwise elusive matching pairs.
We note that positions can also be made available on other devices
such as smartphones. An autonomous programmed quadcopter, however, is
particularly enticing because of its ability to fly to areas that are
visible to the human eye but inaccessible for a
human to reach. Such areas do not lend themselves easily to high
quality images.
\subsection{Related Work}
Panoramic image stitching (alternatively, image mosaicing) is a
well-studied problem in the field of computer vision. Representative
works include~\cite{Milgram1975,Milgram1977,Capel,Szeliski1997,Brown07,Brown03}.
A full discussion of related work is outside the scope of this report; readers are
referred to~\cite{Szeliski05imagealignment} for an excellent survey.
Given the maturity of this area, various freeware and
commercial packages are available for performing image stitching; the most
notable are AutoStitch \cite{autostitch}, Microsoft's Image
Composite Editor \cite{ICE}, and Adobe's Photoshop
\cite{photoshop}.
All of these methods are based on a similar strategy of finding
features in each image, matching these features between images, and
then computing pairwise image warps to align them together. A global
bundle adjustment is often applied to globally refine the alignment.
All of the aforementioned methods assume the imaged scene is planar or
that the camera has been rotated carefully around its center of
projection to avoid parallax.
Brown et al.~\cite{Brown05} have used a new type of invariant features
located at Harris corners in discrete scale-space and oriented using a
blurred local gradient for stitching. Eden et al.~\cite{Eden} were
able to stitch images with large exposure difference as well as large
scene motion into single HDR quality image without using any
additional camera hardware.
All of these image mosaicing methods work only when there is an
``intersection'' in the feature space of the images to be stitched. When there
are ``gaps'' (either physical or due to lack of features) between the
images to be stitched, it is not clear how to perform the stitching. We discuss
how to use the IMU data that accompanies our input images to help
overcome these problems.
\subsection{Methodology}
Our goal is to compute a panorama of a scene lying on a
planar surface that has regions of vacant space. A schematic for
this problem is shown in Figure~\ref{fig:schematic}.
\begin{figure}[h!]
\centering
\includegraphics[width=0.49\textwidth]{mosaicing/figures/indoor}
\includegraphics[width=0.49\textwidth]{mosaicing/figures/stereoOverlap}
\caption{ \label{fig:schematic} Problem definition. (Left) Vacant spaces
are encountered in various scenes. When individual portions are
captured by a quadcopter, how does one create the complete mosaic
given that common features are either not available, or
confusing?\\
(Right) Simplified reduction of the problem to a geometrical structure.
}
\end{figure}
The method adopted is pictorially depicted in the overview shown in
Figure~\ref{fig:workflow} and is described in detail later on. In
brief, we systematically acquire a video of the scene, reduce the large
number of images into a manageable number, and finally combine the
images acquired from different positions into a mosaic.
\begin{figure}[h!]
\centering
\includegraphics[width=\textwidth]{mosaicing/figures/Workflow}
\caption{ \label{fig:workflow} Overview: Input imagery is
systematically acquired (top left) by a quadcopter. In the next
step, interesting images are found by clustering the video into
regions based on positional data. A graph is constructed using
proximal images. For each connected component in the graph, standard
stitching techniques are used to create mini-panoramas, which are
then joined together into a super-panorama,
again using IMU data.}
\end{figure}
\noindent\textbf{Video acquisition}
We first dispatch the quadcopter as close to the scene as
possible. The corners of a rectangular area of interest are provided to
the quadcopter, and it is programmed to traverse the area in a snake-like
raster scan fashion. There are various control aspects involved
in sending a quadcopter; in outdoor areas, the quadcopter is impacted
by wind and might lose its way. The control aspects of the
quadcopter are discussed in Section \ref{sec:furtherWork}.
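For concreteness, the following is a minimal sketch of how such a snake-like
raster scan over the area of interest could be generated (plain C++; the
\texttt{Waypoint} type, grid dimensions, and cell spacing are illustrative
assumptions, not the exact interface used with the drone).
\begin{verbatim}
// Sketch: generate a snake-like (boustrophedon) raster scan over a
// rectangular area of interest. Assumes the area is axis-aligned in a
// plane parallel to the wall; (x0, y0) is one corner, dx, dy the spacing.
#include <vector>

struct Waypoint { double x, y; };   // hypothetical waypoint type

std::vector<Waypoint> snakeScan(double x0, double y0,
                                double dx, double dy,
                                int cols, int rows) {
    std::vector<Waypoint> path;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            // reverse the column order on every other row ("snake")
            int cc = (r % 2 == 0) ? c : (cols - 1 - c);
            path.push_back({x0 + cc * dx, y0 + r * dy});
        }
    }
    return path;
}
\end{verbatim}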
The quadcopter returns with a video of the scene. Even a short video of
about a minute, when given to a stitching program such as
Autostitch, overwhelms the program, rendering it unusable. In the rest
of this section, we use Autostitch to refer to state-of-the-art
stitching programs such as Autostitch, Photoshop, etc.
\noindent\textbf{Acquiring interesting images}
\label{sec:selection}
Our goal in this step is to reduce the amount of input data and
produce a set of interesting images. In other words, we wish to
convert a video into an album of images. The key difference between
our problem and standard albumization \cite{Aner, Lee} is the use of
positional information. A standard quadcopter has an inertial
measurement unit (IMU) that, after calibration, can give accurate
positional information.
Using this, it is possible to cluster the images and sort them
into an $m\times n$ grid. (Occasionally we have encountered outlier
situations and thus erroneous positional information. Thus we believe
a standard vision-based clustering method can be used to either drop
frames or to correctly include frames based on image content.) The
number of cluster centers is automatically determined using an
agglomerative bottom-up hierarchical clustering method \cite{Lior}, with the
additional requirement that the whole scene (represented by the
positional data) is covered.
{\bf Clustering Details} We assume that each IMU data position
corresponds to an image of definite fixed dimensions. Consider each
position of the IMU data to be a leaf node. Two nodes are greedily
combined based on the closest Euclidean distance, and replaced with an
internal node. The position of the internal node is set to be the
centroid of the two nodes, and each internal node now corresponds to a
virtual image taken by a virtual quadcopter. The algorithm
recursively merges all the nodes till we end up with a root. In the
next phase, we proceed top-down, level by level, to find the
cluster centers. A set of nodes is output as
cluster centers if the union of these nodes completely covers the
scene. From the bottom-up construction, it is clear that the root
represents a single position, and thus a single image, and most likely
will not cover the scene. At the other extreme, the set of all leaf
nodes \emph{will} cover the scene. The top-down algorithm proceeds
from the root and finds the minimal set of nodes which covers the scene.
Once the cluster centers are found, we pick the leaf node closest
to each cluster center to obtain a real image.
This process is pictorially described in
Figure~\ref{fig:selection}.
\begin{figure}[h!]
\centering
\includegraphics[width=\textwidth]{mosaicing/figures/selection}
\caption{ \label{fig:selection} Using IMU data, an image stream is
transformed into a set of interesting images for purpose of
panorama creation. }
\end{figure}
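The sketch below collapses the build-then-cut procedure described above into a
single greedy loop; it is a simplification meant only to make the idea concrete,
not our exact implementation (plain C++; the \texttt{covers} predicate, which
checks whether the fixed-size image footprints centred at the given nodes cover
the scene rectangle, is supplied by the caller).
\begin{verbatim}
// Sketch: greedy agglomerative (bottom-up) merging of IMU positions.
// Each node stands for an image of fixed, known footprint; merged nodes
// sit at the centroid of the pair and correspond to "virtual" images.
#include <vector>
#include <cmath>
#include <limits>
#include <functional>

struct Node { double x, y; };

std::vector<Node> clusterCentres(
    std::vector<Node> active,
    const std::function<bool(const std::vector<Node>&)>& covers)
{
    std::vector<Node> previous = active;
    // Merge closest pairs while the current set still covers the scene;
    // the last covering set is returned as the cluster centres.
    while (active.size() > 1 && covers(active)) {
        previous = active;
        std::size_t bi = 0, bj = 1;
        double best = std::numeric_limits<double>::max();
        for (std::size_t i = 0; i < active.size(); ++i)
            for (std::size_t j = i + 1; j < active.size(); ++j) {
                double d = std::hypot(active[i].x - active[j].x,
                                      active[i].y - active[j].y);
                if (d < best) { best = d; bi = i; bj = j; }
            }
        Node merged = { (active[bi].x + active[bj].x) / 2.0,
                        (active[bi].y + active[bj].y) / 2.0 };
        active.erase(active.begin() + bj);   // bj > bi, erase in this order
        active.erase(active.begin() + bi);
        active.push_back(merged);
    }
    return covers(active) ? active : previous;
}
\end{verbatim}
After the centres are found, the leaf (real image) nearest to each centre is
picked, exactly as described above.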
In practice, the number of cluster centers for the scenes we have
covered is within the capacity of Autostitch. As mentioned in the
introduction, as long as there are sufficiently varying and
``matchable'' features, Autostitch is able to produce a reasonable
result. However, if there are very few features in the overlapping region
of two images, then the output is not acceptable. This situation
arises when there is vacant space between two pictures.
{\bf Time complexity} Autostitch has not been designed
to use positional information. As a result, if there are $N$ input
images, the program has to consider approximately
$O(N^2)$ candidate image pairs for matching. Our program is able to mosaic in an $O(N)$
fashion.
{\bf Spanning Trees}
Specifically, we assume at this point that the interesting photos are
available in the form of an $m \times n$ grid. First, we find SURF
\cite{Bay} features for each image in the grid. Next, we use the Best of
Nearest Neighbor matcher (from the OpenCV library) with Random Sample
Consensus (RANSAC) \cite{Fischler1981} to find geometrically
consistent matches between neighboring images inside the grid. We
create a graph with images as nodes, and add an edge between two
nodes if there are sufficient matches. Recall at this point
that if there are ``vacant spaces'' there will not be enough features
for successful matches; the graph will end up with multiple
(disconnected) components. We next compute a spanning tree
for each of the components. Given a spanning tree, the center of the
spanning tree is a node from which the distance to all other nodes is
minimal \cite{Kocay}. Next, we calculate the homography of each image with respect
to the spanning tree center. Finally, for each spanning tree, we stitch
all pictures using the computed homographies by warping all images
with reference to the image at the spanning tree center. The spanning tree is
an $O(N)$ structure. The process is described in
Figure~\ref{fig:graph}.
\begin{figure}[h!]
\centering
\includegraphics[width=\textwidth]{mosaicing/figures/graphCreation}
\caption{ \label{fig:graph} The acquired interesting images are
segmented and individual mini-panoramas are constructed. These
are then combined into the desired super-panorama using IMU data.}
\end{figure}
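A compact sketch of the edge test used during graph construction is given
below (OpenCV 2.4 C++ API). For brevity it replaces the library's Best of
Nearest Neighbor matcher with a plain brute-force match followed by a RANSAC
homography, which captures the same geometric-consistency idea; the Hessian
threshold and \texttt{minInliers} are illustrative values, not the ones we use.
\begin{verbatim}
// Sketch: decide whether two neighbouring grid images share an edge in
// the image graph, based on geometrically consistent SURF matches.
// OpenCV 2.4.x: SURF lives in the nonfree module.
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/nonfree/features2d.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

bool hasEdge(const cv::Mat& imgA, const cv::Mat& imgB, int minInliers = 20)
{
    cv::SURF surf(400.0);                      // Hessian threshold
    std::vector<cv::KeyPoint> kpA, kpB;
    cv::Mat descA, descB;
    surf(imgA, cv::Mat(), kpA, descA);
    surf(imgB, cv::Mat(), kpB, descB);
    if (kpA.size() < 4 || kpB.size() < 4) return false;

    // Nearest-neighbour matching on SURF descriptors (L2 distance)
    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector<cv::DMatch> matches;
    matcher.match(descA, descB, matches);

    std::vector<cv::Point2f> ptsA, ptsB;
    for (size_t i = 0; i < matches.size(); ++i) {
        ptsA.push_back(kpA[matches[i].queryIdx].pt);
        ptsB.push_back(kpB[matches[i].trainIdx].pt);
    }
    if (ptsA.size() < 4) return false;

    // RANSAC homography; the inlier count measures geometric consistency
    std::vector<unsigned char> inlierMask;
    cv::Mat H = cv::findHomography(ptsA, ptsB, CV_RANSAC, 3.0, inlierMask);
    if (H.empty()) return false;
    return cv::countNonZero(inlierMask) >= minInliers;
}
\end{verbatim}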
\noindent\textbf{Super-panoramas}
In this section, we consider the situation when programs like
Autostitch fail. We assume that the output of the previous step has
resulted in multiple spanning trees where each spanning tree center
corresponds to a specific depth. This is the depth of the center of
the spanning tree, since we have stitched all images by taking the
spanning tree center as a reference. Individual panoramas for each
spanning tree, termed mini-panoramas, have been created. A
super-panorama must be created from the mini-panoramas; these usually
correspond to different depths for at least two reasons.
First, it is invariably difficult, if not impossible, to control a
quadcopter to stay at the exact depth, even in indoor scenes. The
aerodynamics and the thrust produced tend to make the quadcopter
drift. Second, it might also be necessary to let the quadcopter probe
and come closer to the scene so as to get a ``good picture''.
A super-panorama is created using a two-step process. Assume two trees in
the forest corresponding to area A and area B of the scene (see
Figure~\ref{fig:stereo}).
\begin{figure}[h!]
\centering
\includegraphics[width=0.45\textwidth]{mosaicing/figures/move}
\includegraphics[width=0.45\textwidth]{mosaicing/figures/stereo}
\caption{ \label{fig:stereo} (Left) The virtual picture as seen from position
B' is computed using Equation~\ref{eq:moveRelation} from the real picture
taken from B. (Right) Using the disparity, and the baseline width,
it is possible to depict the composite scene obtained from both A
and B' (from the view point of A).}
\end{figure}
Assume that a mini-panorama is created from each of these two areas, and the
depth of the planar surface from the camera is greater for A than for
B. We then take the mini-panorama image captured at B, and `move' it to
a new location B' whose depth (from the imaged surface) is the same as
that of A. The resulting image will be smaller; the images are
related by the equation
\begin{equation}
\label{eq:moveRelation}
{\bf \frac{x'}{x} = \frac{Z}{Z'}}
\end{equation}
where {\bf x} (respectively {\bf x'}) represents a pixel location of
the image in B (respectively, B') and
{\bf Z} (respectively {\bf Z'}) represents the depth of the imaged
surface from B (respectively, B').
We can now treat the resulting images from A (unchanged) and B'
(computed from Equation~\ref{eq:moveRelation}) as forming a stereo pair.
Using the stereo disparity formula, we can ``place,'' from the view
point of A, the image captured from B, thereby creating a
super-panorama.
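To make the placement step concrete, the standard rectified-stereo disparity
relation can be used as a sketch; here $f$ denotes the focal length in pixels,
$b$ the baseline between A and the virtual position B', and $Z$ the common
depth of the planar surface after the move (these symbols are our notation for
this illustration):
\begin{equation*}
d = \frac{f\,b}{Z},
\end{equation*}
so that, after the rescaling of Equation~\ref{eq:moveRelation}, the image from
B' is shifted horizontally by $d$ pixels along the direction of the baseline
before being composited into the mosaic referenced to A.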
\subsection{Experiments and Results}
\label{sec:results}
All our experiments were carried out with the inexpensive consumer
quadcopter Parrot AR Drone 2.0. We have used the ROS-based
ARDrone Autonomy driver to communicate with the drone. For the purpose
of showing the efficacy of our method, we also took a picture of the
scene from a distance with a 5 megapixel camera so that the reader can
better understand the scene.
We have implemented our algorithm in C++ using the OpenCV library
(OpenCV 2.4.9). Experiments were performed on a PC with an Intel Core i7
processor (3.4 GHz) and 8 GB RAM.
\noindent\textbf{Selecting Images}
In our first experiment, we wanted to ensure that the selection of
images was comprehensive and useful. We sent the drone to image
an outdoor scene with no vacant space. We note here that there were approximately
3000 images in the raw video. Autostitch was unable to cope when fed
this large number of frames.
One way to produce some sort of mosaic was to simply reduce the amount
of data given to Autostitch. Figure~\ref{fig:sac3}(a) shows uniformly
(time) sampled images from the video. When these sampled images are
given to Autostitch or to Adobe Photoshop, we find
(Figure~\ref{fig:sac3}(b)) that these programs are able to produce
some output, but the results are not satisfactory.
Instead of feeding time-sampled images, we ran our saliency selection
algorithm (as explained in Section~\ref{sec:selection}) on the video
which resulted in $N = 5$ images. Though the number of input images
in the video is large, the total distance covered by the quadcopter in
this duration (of around 90 seconds) is small; thus the
number of distinct images returned by the algorithm shows a dramatic
reduction. Figure~\ref{fig:sac3}(c) shows examples of selected images.
Many of the images are similar to the time sampled version; however,
the occasional differences are enough to make Autostitch work. The
results are shown in Figure~\ref{fig:sac3}(d).
\begin{figure*}[h!]
\centering
\includegraphics[width=0.87\linewidth]{mosaicing/results/ValidationResult}
\caption{ (a) Uniformly sampled images from an outdoor video
expedition. (b) Output of state-of-the-art photo stitchers
(left: Autostitch, right: Adobe Photoshop CS6) on uniformly time
sampled images. As time sampled images do not guarantee coverage of
the scene, the panorama is broken; the top portions do not belong at
the right place (see (d)). (c) Salient image selection from the set of
approximately 3000 images using positional information. (d) When the
salient images are given to Autostitch (left) and Photoshop (middle),
we can create a panoramic mosaic (since there are no vacant
spaces). We also show the result from our stitching algorithm
(bottom right).}
\label{fig:sac3}
\end{figure*}
We tested our salient image selection algorithm on another
input stream, which had about 9000 images. The selection algorithm of Section
\ref{sec:selection} pruned the video into $N=15$ images. The scene as captured
by a typical camera can be seen in Figure~\ref{fig:validResults}(a). A sample
of the selected images is seen in Figure~\ref{fig:validResults}(b).
Figure~\ref{fig:validResults}(c,d,e) shows the comparison of the outputs of state-of-the-art
stitching methods with the output of our algorithm. The input scene
is such that there are no vacant spaces; our stitching algorithm's output is
similar to that of Autostitch \cite{autostitch} as well as Photoshop
\cite{photoshop}. The results on this dataset again demonstrate that programs like
Autostitch and Photoshop can correctly stitch the images selected by our algorithm
(when there are no vacant spaces).
\begin{figure}[h!]
\centering
\includegraphics[width=0.87\linewidth]{mosaicing/results/sac1}
\caption{ (a) An outdoor scene captured by a standard camera. (b) Salient image selection from the set of
approximately 9000 images using positional information. (c) Output of
Autostitch \cite{autostitch} on the images selected by our algorithm. (d) Output of Adobe Photoshop CS6
\cite{photoshop} on the selected images. (e) Output of our method.}
\label{fig:validResults}
\end{figure}
In summary, these experiments provide evidence to show that (a) our
saliency selection algorithm is reasonable and (b) our stitching
results are comparable to that of Autostitch for the kind of scenes
considered.
\noindent\textbf{Indoor Imagery with Vacant Spaces}
Our next set of experiments was conducted in an indoor
environment. The input stream had about 4300 images. The selection algorithm
(Section~\ref{sec:selection}) pruned the video into $N=4$ images.
A sample of the selected images is seen in Figure~\ref{fig:teaser}.
\begin{figure}[t!]
\centering
\includegraphics[width=\textwidth]{mosaicing/figures/teaser.pdf}
\caption{ \label{fig:teaser} A long-range photograph of a scene, when probed
by a quadcopter, results in the input images shown on the left. The
state-of-the-art methods (middle column) are unable to make a mosaic because the
vacant space (third picture on the left) does not seem to have any
matchable features with the subsequent input images.}
\end{figure}
There were two disconnected components in the resulting graph.
Autostitch was unable to produce any reasonable output, as seen in
Figure~\ref{fig:teaser}. The scene, captured from a distance, is also
shown; one can see a better orthographic view of the posters.
\vspace{0.5cm}
\noindent\textbf{Outdoor Imagery with Vacant Spaces}
Our next set of experiments was conducted in an outdoor
environment. The input stream had about 12000 images. The selection
algorithm pruned the video into $N=30$ images. A sample of the
selected images is seen in Figure~\ref{fig:results}, along with the
scene as captured by a smartphone and the comparison of the outputs of
state-of-the-art stitchers with the output of our algorithm.
\begin{figure}[h!]
\centering
\includegraphics[width=0.86\linewidth]{mosaicing/results/results}
\caption{(a) An outdoor scene captured by a standard camera in an
exhibition. The approach to the area is normally cordoned off and one
needs permission to get a quadcopter to take the picture. Notice a
significant gap between the two posters. (b) Pruned images from the
quadcopter video using our saliency algorithm. (c) Output of
Autostitch on the selected images. The mosaic is not reasonable
presumably because of the confusion in features. (d) Output of Adobe
Photoshop CS6 on the selected images. The vacant space posed a
problem to the feature matching algorithm, so instead of a mosaic,
individual pieces were output as mini-panoramas. (e) Our output on
the selected images. We are able to join the two posters (separated by
vacant space) using IMU data.}
\label{fig:results}
\end{figure}
The next input stream had about 9100 images. The selection algorithm pruned the
video into $N=28$ images. The scene as captured by a smartphone can be seen in
Figure~\ref{fig:results1}(a). A sample of the selected images is seen in
Figure~\ref{fig:results1}(b). Figure~\ref{fig:results1}(c,d,e) shows the
comparison of the outputs of state-of-the-art stitchers with the output of our
algorithm. As there are vacant spaces, Autostitch~\cite{autostitch} as well as
Photoshop~\cite{photoshop} are unable to stitch the images accurately. In contrast,
since we use positional data, our output is an acceptable mosaic, and shows an
orthographic view.
\begin{figure}[h!]
\centering
\includegraphics[width=0.85\linewidth]{mosaicing/results/orange_blue.pdf}
\caption{(a) An outdoor scene captured by a standard camera in an
exhibition. The approach to the area is normally cordoned off and one
needs permission to get a quadcopter to take the picture. Notice a
significant gap between the two posters. (b) Pruned images from the
quadcopter video using our saliency algorithm. (c) Output of
Autostitch~\cite{autostitch} on the selected images. The mosaic is not
reasonable presumably because of the confusion in features. (d) Output of Adobe
Photoshop CS6~\cite{photoshop} on the selected images. The vacant space posed
a problem to the feature matching algorithm, so instead of a mosaic,
individual pieces were output as mini-panoramas. (e) Our output on
the selected images. We are able to join the two posters (separated by
vacant space) using IMU data.}
\label{fig:results1}
\end{figure}
Another input stream had about 8500 images. The selection algorithm pruned the video
into $N=17$ images. The scene as captured by a smartphone can be seen in
Figure~\ref{fig:results2}(a). A sample of the selected images is seen in
Figure~\ref{fig:results2}(b). Figure~\ref{fig:results2}(c,d,e) shows the
comparison of the outputs of state-of-the-art stitchers with the output of our
algorithm. As there are vacant spaces, Autostitch~\cite{autostitch} as well as
Photoshop~\cite{photoshop} are not able to stitch the images accurately. In
contrast, since we use positional data, our output is an acceptable mosaic.
\begin{figure}[h!]
\centering
\includegraphics[width=0.85\linewidth]{mosaicing/results/Purple_red}
\caption{(a) An outdoor scene captured by a standard camera in an
exhibition. Notice a significant gap between the two posters. (b) Pruned
images from the quadcopter video using our saliency algorithm. (c) Output of
Autostitch~\cite{autostitch} on the selected images. The mosaic is not
reasonable presumably because of the confusion in features. (d) Output of Adobe
Photoshop CS6~\cite{photoshop} on the selected images. The vacant space posed a
problem to the feature matching algorithm, so instead of a mosaic,
individual pieces were output as mini-panoramas. (e) Our output on
the selected images. We are able to join the two posters (separated by
vacant space) using IMU data.}
\label{fig:results2}
\end{figure}
\section{Planned Work in Mosaicing through Quadcopter}
In this section, we describe planned advancements in mosaicing through a
quadcopter.
\subsection{Path Planning of Quadcopter}
\label{sec:furtherWork}
We encountered the following problems while controlling the quadcopter manually:
\begin{itemize}
\item As manual control is not smooth, most of the captured images are
blurred.
\item Coverage (area covered in unit time) is not uniform.
\item We may not be able to cover the desired area completely (and exactly) due to
manual errors.
\end{itemize}
The solution to these problems is to plan the trajectory of the quadcopter
so as to ``cover'' the desired area. To achieve this, we have started
developing a tool based on PTAM (Parallel Tracking and Mapping).
The following work has been done to date.
\textbf{UI:} The user clicks points to define the area he/she wants to cover on
a planar region. The keypoints closest to the clicked points are found, and the 2D convex
hull of those keypoints is shown to the user for confirmation. The user may add or remove
points; the updated 2D convex hull is then shown.
\textbf{Path Planning:} Once the user confirms the area to cover, the plane
equation is estimated using the keypoints inside the 2D convex hull. The corners of
the 2D convex hull are then projected onto the calculated plane. A grid of overlapping
cells (fifty percent overlap in the horizontal as well as the vertical direction) is
formed inside the area bounded by the projected corners of the convex hull, as shown in
Figure \ref{fig:pathPlanning}. Currently, the width and height of each cell are fixed,
but they can be taken from the user in the future.
\begin{figure}[h!]
\centering
\includegraphics[width=\linewidth]{furtherWork/pathPlanning}
\caption{Schematic of path planning of quadcopter for single plane.}
\label{fig:pathPlanning}
\end{figure}
Each cell in the grid represents the area to be covered in a single frame.
Finally, the camera extrinsics, i.e., the camera position, are calculated for each cell in
the grid such that the area in that cell is covered in one frame. The corresponding
quadcopter position is then computed for each calculated camera position.
We then maneuver the quadcopter along the estimated trajectory and record
photographs at the specified points.
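As an illustration of this last step, the sketch below computes, for each grid
cell, a camera position from which that cell fills one frame (plain C++; the
cell structure, the horizontal field of view \texttt{fovX}, the plane-aligned
coordinate frame, and the fronto-parallel assumption are all simplifications of
the actual tool, not its interface).
\begin{verbatim}
// Sketch: for a fronto-parallel plane, place the camera on the cell's
// normal at the distance where the cell width spans the full image width.
// Coordinates: x, y lie in the plane; z is measured along its normal.
#include <vector>
#include <cmath>

struct Cell    { double cx, cy, width, height; }; // cell centre and size (m)
struct CamPose { double x, y, z; };               // desired camera position

std::vector<CamPose> cameraPositions(const std::vector<Cell>& cells,
                                     double fovX /* radians */)
{
    std::vector<CamPose> poses;
    for (std::size_t i = 0; i < cells.size(); ++i) {
        // distance at which a cell of this width fills the horizontal FOV
        double depth = (cells[i].width / 2.0) / std::tan(fovX / 2.0);
        poses.push_back({cells[i].cx, cells[i].cy, depth});
    }
    return poses;
}
\end{verbatim}
The quadcopter position for each pose is then obtained by accounting for the
fixed offset of the camera from the quadcopter's body frame.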
\textbf{Further Work:} We need to analyze the quality of the images recorded at
the specified points in order to determine their usefulness for mosaicing. Accordingly, we
may have to adapt the path planning algorithm.
\subsection{Multi-planar Cases}
In some cases, the area to be covered is actually divided across multiple planes
that are parallel to each other but at different depths (e.g., poster boards
kept at different depths). In that case, instead of fitting a single plane, we need to fit multiple
planes and repeat the process (from grid formation onwards) for each plane.
There may also be cases where multiple planes are at an angle to each
other. There are two subcases: one in which the quadcopter is surrounded by
multiple planes (Figure \ref{fig:pathInnerOuter}, left); another in which the
quadcopter has to go around multiple planes (Figure \ref{fig:pathInnerOuter}, right).
In both cases, we will cover the individual planes as in the base case. In the
first case, the quadcopter will first rotate around its Z-axis before approaching
the second plane. In the second case, the quadcopter will move in a circular trajectory
while moving from one plane to another. The control strategies for this
case are more complicated.
\begin{figure}[h!]
\centering
\includegraphics[width=0.45\linewidth]{furtherWork/pathInner}
\includegraphics[width=0.45\linewidth]{furtherWork/pathOuter}
\caption{Schematic of path planning of quadcopter for multi-planar
cases.\textbf{Left:} Quadcopter is surrounded by multiple planes.
\textbf{Right:} Quadcopter has to surround multiple planes.}
\label{fig:pathInnerOuter}
\end{figure}
\section{Stagnant Water Identification}
Though mosaicing scenes containing vacant spaces using a quadcopter is
a challenging application, it has limited social use. Hence, we wished to use
the quadcopter in an application that may help a social cause. In this section, we
show how a quadcopter may be used to ``identify stagnant water'' around
residential places, as stagnant water is a strong contributor to outbreaks of
deadly diseases like dengue and malaria.
Dengue~\cite{WHO15Dengue} is a troublesome debilitating disease with no
known preventive vaccine or cure. Doctors advocate that the best way
to avoid this disease is to avoid being bitten by mosquitoes, which is
virtually an impossibility for many people in India. The greatest
risk of contracting dengue (pronounced DENgee) is in the Indian
subcontinent.
Technology is a must in tackling this situation. The risk of disease
can be reduced by using insect repellents. In our institute, the
common method has been the spraying of insecticides. Reports in the
media~\cite{china} indicate that China has flooded a small island with
half a million sterile mosquitoes, released to suppress the disease-carrying
mosquitoes. Such measures have unknown and unforeseen environmental
impacts on the ecosystem. Regardless, researchers are convinced that
there is no single magic bullet to tackle the disease.
Our work, started prior to the announcement of ``Project
Premonition,'' is similar to that of \cite{Microsoft15}. Instead of
attempting to destroy the mosquito, we seek to detect the reason for
the increased outbreak, especially in urban areas. Our work complements that
of \cite{Microsoft15} --- the goal in \cite{Microsoft15} is to ``catch wild
mosquitoes'' (typically out in the field) by creating novel mosquito traps, and
then to test the mosquitoes for pathogens. New traps are placed and
retrieved by drones. In contrast, our work emphasizes the need for
identifying the location of these traps.
\begin{figure}[h!]
\centering
\includegraphics[width=\linewidth]{stagnantWater/figures/teaser.pdf}
\caption{(a) Our hovering quadcopter. (b) A typical
rooftop on campus. (c) An earlier method \cite{rankin2004daytime} applied, with
output marked in red. (d) Our output. Notice (marked with a black oval in
(c)) that \cite{rankin2004daytime} mistakes non-puddle patches for puddles.
Also, note (marked with a green ellipse in (d)) that \cite{rankin2004daytime} is
not able to detect puddles that are dark.}
\label{teaser}
\end{figure}
One of the major reasons behind the growth of mosquitoes is the
continuous existence of \emph{water puddles} around residences in
urban India. The virus is carried by mosquitoes, and these breed in
stagnant water. \emph{Can we detect stagnant water?} When we surveyed
terraces (Fig.~\ref{teaser}) and ``chhajjas'' of various buildings in
our institute, we found split air conditioners (ACs) dumping condensed
water. Such areas can be surveyed using autonomous quadcopters.
\subsection{Related Work}
The method in~\cite{santana12} for the detection of
water relies on the chaotic nature of water's dynamic texture to
exploit a measure of entropy over the trajectories obtained from
optical flow trackers over several frames. \cite{zhang10} has
introduced a descriptor which is tolerant to the flip transformation
and even non-rigid distortions, such as ripple effects. \emph{These
methods and others in the literature focus on the turbulent aspects
of water}, largely absent in our application that focuses on
stagnant water. Our method is closest to that of Rankin et
al.~\cite{rankin2004daytime, rankin11} who have implemented a rule-based water
detector based on sky reflections. These rules are established
based on an analysis of images captured in wide-open areas on
cross-country terrain. Not only do we have new methods, \emph{our
datasets are captured in urban areas using a quadcopter and thus these
rules seem unlikely to be readily applicable.}
\subsection{Methodology}
Our method is based on a combination of a color-based method and an optical flow
based method. We establish the need for this combination in the first two parts below.
\vspace{0.5cm}
\noindent\textbf{Appearance-based Detection}
\begin{figure}[h!]
\centering
\includegraphics[width=0.4\linewidth]{stagnantWater/figures/IMG_PAIR_27_1}
\hfill
\includegraphics[width=0.4\linewidth]{stagnantWater/figures/IMG_PAIR_27_1_H}
\includegraphics[width=0.4\linewidth]{stagnantWater/figures/IMG_PAIR_27_1_S}
\hfill
\includegraphics[width=0.4\linewidth]{stagnantWater/figures/IMG_PAIR_27_1_V}
\caption{An image (top left) and the HSV components. The puddle has
low saturation (bottom left) but high intensity (bottom right).
The hue is indeterminable and is based on the environment.}
\label{fig:HSV}
\end{figure}
Under ambient lighting conditions, puddle areas display high
brightness and low saturation as can be seen in
Fig.~\ref{fig:HSV}. Features based on these are fed to an SVM
classifier. Training an SVM is a labour intensive task. To reduce the
effort, we have developed a tool shown in Fig.~\ref{fig:training}.
\begin{figure}[h!]
\centering
\includegraphics[width=0.9\linewidth]{stagnantWater/figures/trainingData.pdf}
\caption{The process for creation of training data. The user selects
the stagnant water area by drawing a contour to produce `positive'
and `negative' data.}
\label{fig:training}
\end{figure}
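A minimal sketch of the feature-plus-classifier pipeline described above is
shown below (OpenCV 2.4 C++ API, matching the OpenCV 2.4.9 implementation noted
later; the histogram bin count and the SVM parameters are illustrative choices
rather than the values used in our tool).
\begin{verbatim}
// Sketch: describe an image patch by its saturation/value histograms and
// classify it with an SVM (OpenCV 2.4 CvSVM).
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/ml/ml.hpp>
#include <vector>

// Concatenated, L1-normalised S and V histograms of a BGR patch.
cv::Mat hsvFeature(const cv::Mat& patchBGR, int bins = 16)
{
    cv::Mat hsv;
    cv::cvtColor(patchBGR, hsv, CV_BGR2HSV);
    std::vector<cv::Mat> ch;
    cv::split(hsv, ch);                       // ch[1] = S, ch[2] = V

    cv::Mat feat(1, 2 * bins, CV_32F);
    int histSize[] = { bins };
    float range[] = { 0.0f, 256.0f };
    const float* ranges[] = { range };
    for (int c = 0; c < 2; ++c) {
        cv::Mat hist;
        int channel = 0;
        cv::calcHist(&ch[c + 1], 1, &channel, cv::Mat(), hist, 1,
                     histSize, ranges);
        cv::normalize(hist, hist, 1.0, 0.0, cv::NORM_L1);
        for (int b = 0; b < bins; ++b)
            feat.at<float>(0, c * bins + b) = hist.at<float>(b);
    }
    return feat;
}

// Training: rows of 'samples' are hsvFeature() vectors; 'labels' are +1/-1.
void trainPuddleSVM(const cv::Mat& samples, const cv::Mat& labels, CvSVM& svm)
{
    CvSVMParams params;
    params.svm_type    = CvSVM::C_SVC;
    params.kernel_type = CvSVM::RBF;
    params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER, 1000, 1e-6);
    svm.train(samples, labels, cv::Mat(), cv::Mat(), params);
}
\end{verbatim}
At test time, \verb|svm.predict(hsvFeature(patch))| then gives a per-patch
decision that can be turned into the SVM score used later.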
\textbf{Failure of SVM-based methods:} The SVM detector is good at
detecting regions of sky reflected off a puddle. In the HSV color
space, these regions have low saturation (S) and high brightness (V)
values, and are picked up with high reliability by the HSV histogram
feature based SVM detector. However, false negatives are also produced,
since other reflected regions such as trees, buildings, etc., are
usually classified as non-puddle regions, as many of their
characteristics are shared by negative images in the training data set.
\vspace{0.5cm}
\noindent\textbf{Optical flow based Detection}
Fortunately, our images are captured by a moving quadcopter. Optical
flow measures the apparent motion of objects in a scene caused by
relative motion between the camera and the objects. The magnitude of the
optical flow is high for objects that are close, in comparison to
objects at a distance. Fig.~\ref{fig:optical_flow} shows the
magnitude of the optical flow calculated from two images.
\begin{figure}[h!]
\centering
\includegraphics[width=0.32\linewidth]{stagnantWater/figures/IMG_PAIR_27_1}
\hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/figures/IMG_PAIR_27_2}
\hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/figures/IMG_PAIR_27_optical_flow}
\caption{Optical flow. Left, Middle: Frames taken from positions
which are $d$ units apart in the 3D world, with $0.01 \leq d \leq 0.1$. Right:
Magnitude of optical flow. We observe that the magnitude of optical
flow in the reflective parts of the puddle is relatively low.}
\label{fig:optical_flow}
\end{figure}
A recent thesis \cite{Liu11Thesis} presents one of the state-of-the-art
algorithms for optical flow. One requirement is a
spatio-temporal smoothness constraint, which can be challenging to satisfy because
of the jerky movement of the UAV. To resolve this, we use the
inertial measurement unit (IMU) data available on the quadcopter to
synchronize positional information with the video sequence captured by the
quadcopter. In short, we select the frame pair that is spatially the
closest among a set of competing temporally adjacent frames.
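A small sketch of this frame-pair selection is shown below (plain C++; the
\texttt{Frame} record with a time-synchronized IMU position, the candidate
window size, and the minimum baseline of 0.01 units, taken from the range
quoted in Fig.~\ref{fig:optical_flow}, are illustrative assumptions).
\begin{verbatim}
// Sketch: among temporally adjacent candidate frames, pick the partner
// whose IMU position is spatially closest to the reference frame, subject
// to a minimum baseline so that some parallax is present.
#include <vector>
#include <cmath>
#include <algorithm>
#include <cstddef>

struct Frame { double x, y, z; int index; };   // IMU position + frame index

int closestPartner(const std::vector<Frame>& frames, std::size_t ref,
                   std::size_t window = 10, double minBaseline = 0.01)
{
    if (frames.empty() || ref >= frames.size()) return -1;
    int best = -1;
    double bestDist = 1e9;
    std::size_t lo = (ref > window) ? ref - window : 0;
    std::size_t hi = std::min(frames.size() - 1, ref + window);
    for (std::size_t i = lo; i <= hi; ++i) {
        if (i == ref) continue;
        double d = std::sqrt(std::pow(frames[i].x - frames[ref].x, 2) +
                             std::pow(frames[i].y - frames[ref].y, 2) +
                             std::pow(frames[i].z - frames[ref].z, 2));
        if (d >= minBaseline && d < bestDist) {
            bestDist = d;
            best = frames[i].index;
        }
    }
    return best;   // -1 if no candidate satisfies the baseline constraint
}
\end{verbatim}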
\textbf{Failure of optical flow:} Optical flow is essentially being
used in a depth from parallax mode to exploit the fact that still
puddles behave like mirrors. The scenery reflected by such puddles is
usually at a much greater depth than the immediate surroundings of the
puddle. Optical flow is largely independent of hue and saturation.
For the same reason, however, optical flow as a means of detecting
puddles will fail to report true positives when the object that is
being reflected is close by. In such cases, the saturation and
intensity values are useful.
Yet another reason for the failure of optical flow is the
inability to distinguish true ``far away'' regions from apparently far
away regions due to the mirror-like properties of water. To handle
these false positives, we devise a horizon mask based on the principle
that water flows down.
\vspace{0.5cm}
\noindent\textbf{Combined approach}
\textbf{Horizon Mask:} The changes in depth, across consecutive frames, for
far-away scenes as well as for their reflections in a puddle are hard to
distinguish. In previous work \cite{rankin11}, such issues are avoided
by discarding a fixed portion of the image corresponding to far-off
regions, enforced by a constrained input capture method. Since such
constraints are inapplicable under the comparatively agile input
capture conditions of a quadcopter, features derived from the urban
environment are utilized to find plausible puddle regions. In an
urban setting, the abundance of surrounding structures
with distinctly flat surfaces and rectilinear silhouettes enables the
use of edge-detection based methods to bound the planar regions that can
contain a puddle. By applying the Hough transform with calibrated
parameters, followed by length-based selection of the detected lines, an
upper boundary for the puddle region, called the `horizon', is found. The
horizon is in turn used to create a binary mask that is applied to the local
scores from the other techniques before normalization.
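A condensed sketch of this horizon-mask construction is given below (OpenCV 2.4
C++ API; the Canny and Hough parameters, the minimum line length, and the
rule of taking the longest roughly horizontal segment are placeholders for the
calibrated parameters and length-based selection mentioned above).
\begin{verbatim}
// Sketch: find long, roughly horizontal line segments (building edges) with
// a probabilistic Hough transform, take the longest one as the "horizon",
// and mask out everything above it -- water cannot lie above that boundary.
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>
#include <cstdlib>

cv::Mat horizonMask(const cv::Mat& gray)
{
    cv::Mat edges;
    cv::Canny(gray, edges, 50, 150);

    std::vector<cv::Vec4i> lines;
    // rho = 1 px, theta = 1 degree, vote threshold, min length, max gap
    cv::HoughLinesP(edges, lines, 1, CV_PI / 180.0, 80, gray.cols / 4, 10);

    int horizonRow = 0;      // default: keep the whole image
    int bestLen = 0;
    for (size_t i = 0; i < lines.size(); ++i) {
        int dx = std::abs(lines[i][2] - lines[i][0]);
        int dy = std::abs(lines[i][3] - lines[i][1]);
        if (dx > 4 * dy && dx > bestLen) {     // long and roughly horizontal
            bestLen = dx;
            horizonRow = (lines[i][1] + lines[i][3]) / 2;
        }
    }

    // Binary mask: 0 above the horizon, 1 at and below it.
    cv::Mat mask = cv::Mat::zeros(gray.size(), CV_8U);
    mask.rowRange(horizonRow, gray.rows).setTo(1);
    return mask;
}
\end{verbatim}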
\begin{figure}[h!]
\centering
\includegraphics[width=\textwidth]{stagnantWater/figures/overall_workflow.pdf}
\caption{Overall architecture.}
\label{fig:overallWorkflow}
\end{figure}
These observations suggested a novel combined approach, as sketched
in Fig.~\ref{fig:overallWorkflow}.
\subsection{Experiments and Results}
All our experiments were carried out with the inexpensive consumer
quadcopter Parrot AR Drone. We remark that one should not
compare expensive military-grade drones with such inexpensive drones.
For the purpose of showing the efficacy of our method, we also took a
picture of the scene from a distance with a 5 megapixel camera for
the reader to better understand the scene. We covered urban places
such as terraces, construction sites, building backyards,
pumping stations, and so on. We have captured around eight different
types of datasets, each having around two to three minutes of video
(around 3000 frames each).
\textbf{Comparison:} We compare our method to \cite{rankin2004daytime} on
representative images, since the methods in \cite{rankin11} do not apply.
%See Fig.~\ref{fig:comparison}.
It can be seen from Fig.~\ref{fig:comparison} that in all images,
\cite{rankin2004daytime} confuses bright patches with puddle patches (note the
red patches on walls, sky, etc., in the middle column of Fig.~\ref{fig:comparison}).
Also, \cite{rankin2004daytime} is not able to detect textured puddle regions (as
seen in the first row of Fig.~\ref{fig:comparison}). (The confidence in
detection is indicated by the red hue in the output image: the darker the red
tinge, the higher the confidence in the detected water region.)
\begin{figure}
\centering
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_73/IMG_PAIR_124_1} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_73/output_124_jpl2} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_73/output_124}
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_81/IMG_PAIR_1_1} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_81/output_1_jpl2} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_81/output_1}
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_82/IMG_PAIR_192_1} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_82/output_192_jpl2} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_82/output_192}
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_83/IMG_PAIR_130_1} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_83/output_130_jpl2} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_83/output_130}
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_63full/IMG_PAIR_102_1} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_63full/output_102_jpl2} \hfill
\includegraphics[width=0.32\linewidth]{stagnantWater/results/dataset_63full/output_102}
\caption{Comparison of the proposed method with \cite{rankin2004daytime}. Confidence in detection is indicated by the red hue in the output image:
the darker the red tinge, the higher the confidence in the detected water regions. \textbf{Left:} Original image.
\textbf{Middle:} Output of \cite{rankin2004daytime}. \textbf{Right:}
Proposed method. It can be seen that \cite{rankin2004daytime} is unable to
detect textured puddle regions. Also, several false positives are
seen to appear in the middle row. Our method is sedate and
sufficient to alert health workers.}
\label{fig:comparison}
\end{figure}
\section{Further Work in Stagnant Water Identification}
Though our results are better than those of the existing methods, they can be improved and the scope
extended in the following directions:
\begin{itemize}
\item Determination of the horizon from environmental cues could be improved and
made more robust in order to deal with factors such as illumination changes,
occlusions, and the unavailability of artifacts (man-made structures). Also, the
parameters involved could be automatically estimated through training.
\item The score obtained from the optical flow magnitudes could be improved by
using the directional components separately and combining them with the
corresponding quadcopter motion vectors, such as velocity and acceleration.
\item The features used by the classifier to obtain the SVM score could be
modified to produce greater separation between puddle and non-puddle
patches.
\end{itemize}
\section{Summary and Future Plan}
We have developed two novel imaging applications using a quadcopter: mosaicing
scenes with vacant spaces, and stagnant water identification. Our results show
significant improvement over existing methods in the respective applications.
We plan to work on path planning of the quadcopter for the mosaicing application to
improve the quality of the results. We have completed nearly half of the planned work;
the remaining work will be done shortly. We will also be extending mosaicing
through the quadcopter to multi-planar cases.
Though our current method for stagnant water identification detects water
puddles quite effectively, we would like to increase the robustness of our
method, mainly by improving the horizon detection technique.
\newpage
\bibliographystyle{plain}
\bibliography{egbib}
\end{document}