Webcam Demo #29

Open · wants to merge 3 commits into master

Conversation

@asbroad commented Jun 11, 2015

Hey Rob,

I added a demo of your system running a person detector on a live webcam feed. It requires Dlib to perform the selective-search portion of the algorithm, and obviously a webcam. I also updated the README to explain how to use the webcam demo. It runs at ~0.75 seconds per frame on a 2GB GPU and ~0.15 seconds per frame on a Titan X.

  • Alex
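For a quick overview, here is a condensed sketch of the loop webcam.py runs (net loading and box drawing are omitted; see the full file in this pull request):

```python
import cv2
import dlib
# from fast_rcnn.test import im_detect  # net setup as in the full script

cap = cv2.VideoCapture(0)                  # open the webcam
while True:
    ret, frame = cap.read()                # grab a frame
    rects = []
    dlib.find_candidate_object_locations(frame, rects, min_size=70)  # selective-search proposals
    # scores, boxes = im_detect(net, frame, proposals)  # score proposals with the net
    # ... draw boxes above the confidence threshold on the frame ...
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # quit on 'q'
        break
cap.release()
cv2.destroyAllWindows()
```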

@cuatristapr

Hello asbroad,
Can you send me the webcam demo file? I'm new to all of this, and I'm trying to understand how this will work for a real-time implementation on a Jetson TK1. Thanks!

@asbroad (Author) commented Jun 24, 2015

Hi Christopher,

To run the webcam demo you need two things in addition to Rob's fast-rcnn: (1) the webcam.py file and (2) Dlib. To get the webcam file, you can check out this pull request by following these instructions (https://help.github.com/articles/checking-out-pull-requests-locally/) or, perhaps more simply, by cloning the forked repository on my GitHub page (the only differences are the webcam file and the added information in the README about how to run it). If you're having trouble installing Dlib, be sure to consult https://dlib.net/compile.html - but it's an extremely easy-to-use library.
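If you want to check the Dlib install on its own before wiring up the demo, a minimal sketch (the image path is a hypothetical placeholder):

```python
import cv2
import dlib

im = cv2.imread('test.jpg')  # any test image; this path is a placeholder
rects = []
dlib.find_candidate_object_locations(im, rects, min_size=70)
print '{:d} candidate object proposals'.format(len(rects))
```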

Oh, and you'll also need a webcam, but I assume that is obvious :)

Hope that helps,

Alex

@cuatristapr

I got the demo running; it works. I have a few questions: I still have no idea what the scale factor actually does, and it's running with a 3-4 second delay on the Jetson TK1, but it's a start!

@asbroad (Author) commented Jun 28, 2015

Hey Christopher,

Sorry for the delay in responding. The scaling factor simply scales the webcam image: if your webcam produces a 640x480 image, a scaling factor of 2 will produce an image scaled down by that factor in each dimension (i.e. the image that is processed will be 320x240). If you have any more specific questions, feel free to message me directly.
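A minimal sketch of that arithmetic with OpenCV (the zeros array is a stand-in for a 640x480 webcam frame):

```python
import cv2
import numpy as np

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a 640x480 frame
scale_factor = 2
small = cv2.resize(frame, (0, 0), fx=1.0/scale_factor, fy=1.0/scale_factor)
print small.shape  # (240, 320, 3): detection runs on the small image, and box
                   # coordinates are multiplied back by scale_factor for display
```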

  • Alex

@arasharchor

@asbroad
Hi!
I am going to make a live demo of Faster R-CNN using my laptop's webcam. Is it possible to make this demo run just by pulling your changes into fast-rcnn?
Something like the second movie here: https://pjreddie.com/darknet/yolo/
which is the real-time demo from the "You Only Look Once" paper.
I am interested in doing this for Faster R-CNN, which is the fastest in the R-CNN family.
Thanks in advance

@asbroad (Author) commented Dec 14, 2015

Hey @smajida , I haven't looked at the Faster R-CNN code much, but from what I remember of the paper, they replace the selective-search region proposal method with another neural net (a region proposal network), so you shouldn't need any of the additions from my pull request (they were just a workaround to avoid MATLAB and to grab images from a computer's webcam). Take a look at their GitHub repo (https://github.com/rbgirshick/py-faster-rcnn), which has a Python implementation and a link to a MATLAB implementation - hope that helps, and good luck!
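To make the difference concrete, a sketch of the two call patterns, based on the im_detect signature quoted later in this thread (the stub below is illustrative only, not either repo's real code):

```python
# Illustrative stub of the two im_detect call patterns (not the real implementation).
def im_detect(net, im, boxes=None):
    """fast-rcnn requires external `boxes` (e.g. Dlib selective-search proposals);
    py-faster-rcnn passes boxes=None and lets the net's RPN propose regions."""
    return None, None

net, im, obj_proposals = None, None, None
scores, boxes = im_detect(net, im, obj_proposals)  # fast-rcnn style: external proposals
scores, boxes = im_detect(net, im)                 # py-faster-rcnn style: RPN inside the net
```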

@arushk1 commented Jan 14, 2016

So if I replace 'person' in your code with, say, 'dog', it should detect that instead, right?

@arasharchor

Yes, as far as I know.


@asbroad (Author) commented Jan 14, 2016

@arushk1 Yes, that's correct. The PASCAL VOC 2007 dataset includes the following 20 classes:

Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
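As a runnable sketch of the lookup demo() performs, swapping 'person' for 'dog' just changes which score/box columns are kept:

```python
# CLASSES as defined in webcam.py (background + the 20 VOC classes)
CLASSES = ('__background__',
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat', 'chair',
           'cow', 'diningtable', 'dog', 'horse',
           'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor')

print CLASSES.index('person')  # 15: the column the current demo uses
print CLASSES.index('dog')     # 12: what it becomes after the swap
```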

@kesonyk commented Apr 2, 2016

@cuatristapr
Hi, I am now running fast-rcnn on a Jetson TK1, but when I cd to $FRCN_ROOT/lib and run make,
it fails with 'arm-linux-gnueabihf-gcc' failed with exit status 1.
Did you have this problem?

@kesonyk commented Apr 2, 2016

I solved it.

@kesonyk commented Apr 2, 2016

@cuatristapr
Hi, I also use a Jetson TK1 to run fast-rcnn, but the given pretrained net is too big. Do you have a smaller net? Thanks.

@tsaiJN commented May 10, 2016

@kesonyk
Hi, I'm also trying to run human detection on a Jetson TK1. If I run it in CPU mode, it works perfectly fine (very slow, though), but whenever I run it in GPU mode, it always gets "killed". I also guess this is a GPU out-of-memory issue; have you managed to solve it?

@miyamon11 commented Oct 13, 2016

Hi, I'm trying to run your webcam.py, but it won't run.
I changed the caffemodel and rewrote your code. When I run the script, the error below occurs:

```
Traceback (most recent call last):
  File "realtime2.py", line 150, in <module>
    [im2, cls, dets, CONF_THRESH] = demo(net, frame, scale_factor, ('person',))
  File "realtime2.py", line 87, in demo
    scores, boxes = im_detect(net, im2, obj_proposals)
  File "/home/keisan/py-faster-rcnn/tools/../lib/fast_rcnn/test.py", line 154, in im_detect
    blobs_out = net.forward(**forward_kwargs)
  File "/home/keisan/py-faster-rcnn/tools/../caffe-fast-rcnn/python/caffe/pycaffe.py", line 97, in _Net_forward
    raise Exception('Input blob arguments do not match net inputs.')
Exception: Input blob arguments do not match net inputs.
```

I can't understand what is happening. What should I do?

The rewritten code is below:

```python
NETS = {'vgg16': ('VGG16', 'VGG16_faster_rcnn_final.caffemodel'),
        'zf': ('ZF', 'ZF_faster_rcnn_final.caffemodel')}
```

and

```python
prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],
                        'faster_rcnn_alt_opt', 'faster_rcnn_test.pt')
caffemodel = os.path.join(cfg.DATA_DIR, 'faster_rcnn_models',
                          NETS[args.demo_net][1])
```

@takemegod

It looks like a file path problem if you are using a Japanese or other non-English directory.

It may show up as a folder/file-not-found problem. What about changing to a shorter folder path?

That should fix the exception.


From: miyamon11 [email protected]
Sent: Thursday, October 13, 2016, 8:40 AM
To: rbgirshick/fast-rcnn
Subject: Re: [rbgirshick/fast-rcnn] Webcam Demo (#29)

The rewritten code is below:

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-

# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick

"""
Demo script showing detections in sample images.
See README.md for installation instructions before running.
"""

import _init_paths
from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms  # this differs: the original was faster_rcnn.nms_wrapper
from utils.timer import Timer
import matplotlib.pyplot as plt
import numpy as np
import scipy.io as sio
import caffe, os, cv2  # no sys here
import argparse
import dlib  # new module

CLASSES = ('__background__',
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat', 'chair',
           'cow', 'diningtable', 'dog', 'horse',
           'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor')

NETS = {'vgg16': ('VGG16',
                  'VGG16_faster_rcnn_final.caffemodel'),
        'zf': ('ZF',
               'ZF_faster_rcnn_final.caffemodel')}


def vis_detections(im, class_name, dets, thresh=0.5):
    """Draw detected bounding boxes. (This definition is exactly the same as in demo.py.)"""
    inds = np.where(dets[:, -1] >= thresh)[0]
    if len(inds) == 0:
        return

    im = im[:, :, (2, 1, 0)]
    fig, ax = plt.subplots(figsize=(12, 12))
    ax.imshow(im, aspect='equal')
    for i in inds:
        bbox = dets[i, :4]
        score = dets[i, -1]

        ax.add_patch(
            plt.Rectangle((bbox[0], bbox[1]),
                          bbox[2] - bbox[0],
                          bbox[3] - bbox[1], fill=False,
                          edgecolor='red', linewidth=3.5)
            )
        ax.text(bbox[0], bbox[1] - 2,
                '{:s} {:.3f}'.format(class_name, score),
                bbox=dict(facecolor='blue', alpha=0.5),
                fontsize=14, color='white')

    ax.set_title(('{} detections with '
                  'p({} | box) >= {:.1f}').format(class_name, class_name,
                                                  thresh),
                 fontsize=14)
    plt.axis('off')
    plt.tight_layout()
    plt.draw()


def demo(net, im, scale_factor, classes):
    """Detect object classes in an image using pre-computed object proposals.
       (This part is completely different.)"""

    im2 = cv2.resize(im, (0, 0), fx=1.0/scale_factor, fy=1.0/scale_factor)

    obj_proposals_in = []
    dlib.find_candidate_object_locations(im2, obj_proposals_in, min_size=70)

    obj_proposals = np.empty((len(obj_proposals_in), 4))
    for idx in range(len(obj_proposals_in)):
        obj_proposals[idx] = [obj_proposals_in[idx].left(),
                              obj_proposals_in[idx].top(),
                              obj_proposals_in[idx].right(),
                              obj_proposals_in[idx].bottom()]

    # Detect all object classes and regress object bounds
    scores, boxes = im_detect(net, im2, obj_proposals)

    # Visualize detections for each class
    CONF_THRESH = 0.8
    NMS_THRESH = 0.3
    for cls in classes:
        cls_ind = CLASSES.index(cls)
        cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
        cls_scores = scores[:, cls_ind]
        dets = np.hstack((cls_boxes,
                          cls_scores[:, np.newaxis])).astype(np.float32)
        keep = nms(dets, NMS_THRESH)
        dets = dets[keep, :]

    return [im2, cls, dets, CONF_THRESH]


def parse_args():
    """Parse input arguments. (This part is the same.)"""
    parser = argparse.ArgumentParser(description='Train a Fast R-CNN network')
    parser.add_argument('--gpu', dest='gpu_id', help='GPU device id to use [0]',
                        default=0, type=int)
    parser.add_argument('--cpu', dest='cpu_mode',
                        help='Use CPU mode (overrides --gpu)',
                        action='store_true')
    parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16]',
                        choices=NETS.keys(), default='vgg16')

    args = parser.parse_args()

    return args


if __name__ == '__main__':
    args = parse_args()

    prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],
                            'faster_rcnn_alt_opt', 'faster_rcnn_test.pt')
    caffemodel = os.path.join(cfg.DATA_DIR, 'faster_rcnn_models',
                              NETS[args.demo_net][1])

    if not os.path.isfile(caffemodel):
        raise IOError(('{:s} not found.\nDid you run ./data/script/'
                       'fetch_fast_rcnn_models.sh?').format(caffemodel))

    if args.cpu_mode:
        caffe.set_mode_cpu()
    else:
        caffe.set_mode_gpu()
        caffe.set_device(args.gpu_id)
    net = caffe.Net(prototxt, caffemodel, caffe.TEST)

    print '\n\nLoaded network {:s}'.format(caffemodel)

    cap = cv2.VideoCapture(0)

    while True:
        # Capture frame-by-frame
        ret, frame = cap.read()

        # Scaling the video feed can help the system run faster (and run on
        # GPUs with less memory), e.g. with a standard video stream of size
        # 640x480, a scale_factor = 4 will let the system run at < 1 sec/frame
        scale_factor = 4
        [im2, cls, dets, CONF_THRESH] = demo(net, frame, scale_factor, ('person',))

        inds = np.where(dets[:, -1] >= CONF_THRESH)[0]
        if len(inds) != 0:
            for i in inds:
                bbox = dets[i, :4]
                cv2.rectangle(frame,
                              (int(bbox[0]*scale_factor), int(bbox[1]*scale_factor)),
                              (int(bbox[2]*scale_factor), int(bbox[3]*scale_factor)),
                              (0, 255, 0), 2)

        # Display the resulting frame
        cv2.imshow('frame', frame)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # When everything is done, release the capture
    cap.release()
    cv2.destroyAllWindows()
```



@miyamon11

Thank you for your reply, takemegod.

I don't think this is a file path problem; I don't use Japanese folder names at all.

In fact,

cfg.MODELS_DIR = ~/py-faster-rcnn/models/pascal_voc
cfg.DATA_DIR = ~/py-faster-rcnn/data

I think the cause may be in utils/cython_nms.py: I changed the nms import from utils/cython_nms.py to fast_rcnn/nms_wrapper.py.

The source of the error is below (~/py-faster-rcnn/caffe-fast-rcnn/python/caffe):

```python
def _Net_forward(self, blobs=None, start=None, end=None, **kwargs):
    """
    Forward pass: prepare inputs and run the net forward.

    Parameters
    ----------
    blobs : list of blobs to return in addition to output blobs.
    kwargs : Keys are input blob names and values are blob ndarrays.
             For formatting inputs for Caffe, see Net.preprocess().
             If None, input is taken from data layers.
    start : optional name of layer at which to begin the forward pass
    end : optional name of layer at which to finish the forward pass
          (inclusive)

    Returns
    -------
    outs : {blob name: blob ndarray} dict.
    """
    if blobs is None:
        blobs = []

    if start is not None:
        start_ind = list(self._layer_names).index(start)
    else:
        start_ind = 0

    if end is not None:
        end_ind = list(self._layer_names).index(end)
        outputs = set([end] + blobs)
    else:
        end_ind = len(self.layers) - 1
        outputs = set(self.outputs + blobs)

    if kwargs:
        if set(kwargs.keys()) != set(self.inputs):
            raise Exception('Input blob arguments do not match net inputs.')
        # Set input according to defined shapes and make arrays single and
        # C-contiguous as Caffe expects.
        for in_, blob in kwargs.iteritems():
            if blob.shape[0] != self.blobs[in_].num:
                raise Exception('Input is not batch sized')
            self.blobs[in_].data[...] = blob

    self._forward(start_ind, end_ind)

    # Unpack blobs to extract
    return {out: self.blobs[out].data for out in outputs}
```
This _Net_forward function is called from the im_detect() function in test.py.
The im_detect() function is below:

```python
def im_detect(net, im, boxes=None):
    """Detect object classes in an image given object proposals.

    Arguments:
        net (caffe.Net): Fast R-CNN network to use
        im (ndarray): color image to test (in BGR order)
        boxes (ndarray): R x 4 array of object proposals or None (for RPN)

    Returns:
        scores (ndarray): R x K array of object class scores (K includes
            background as object category 0)
        boxes (ndarray): R x (4*K) array of predicted bounding boxes
    """
    blobs, im_scales = _get_blobs(im, boxes)

    # When mapping from image ROIs to feature map ROIs, there's some aliasing
    # (some distinct image ROIs get mapped to the same feature ROI).
    # Here, we identify duplicate feature ROIs, so we only compute features
    # on the unique subset.
    if cfg.DEDUP_BOXES > 0 and not cfg.TEST.HAS_RPN:
        v = np.array([1, 1e3, 1e6, 1e9, 1e12])
        hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v)
        _, index, inv_index = np.unique(hashes, return_index=True,
                                        return_inverse=True)
        blobs['rois'] = blobs['rois'][index, :]
        boxes = boxes[index, :]

    if cfg.TEST.HAS_RPN:
        im_blob = blobs['data']
        blobs['im_info'] = np.array(
            [[im_blob.shape[2], im_blob.shape[3], im_scales[0]]],
            dtype=np.float32)

    # reshape network inputs
    net.blobs['data'].reshape(*(blobs['data'].shape))
    if cfg.TEST.HAS_RPN:
        net.blobs['im_info'].reshape(*(blobs['im_info'].shape))
    else:
        net.blobs['rois'].reshape(*(blobs['rois'].shape))

    # do forward
    forward_kwargs = {'data': blobs['data'].astype(np.float32, copy=False)}
    if cfg.TEST.HAS_RPN:
        forward_kwargs['im_info'] = blobs['im_info'].astype(np.float32, copy=False)
    else:
        forward_kwargs['rois'] = blobs['rois'].astype(np.float32, copy=False)
    blobs_out = net.forward(**forward_kwargs)

    if cfg.TEST.HAS_RPN:
        assert len(im_scales) == 1, "Only single-image batch implemented"
        rois = net.blobs['rois'].data.copy()
        # unscale back to raw image space
        boxes = rois[:, 1:5] / im_scales[0]

    if cfg.TEST.SVM:
        # use the raw scores before softmax under the assumption they
        # were trained as linear SVMs
        scores = net.blobs['cls_score'].data
    else:
        # use softmax estimated probabilities
        scores = blobs_out['cls_prob']

    if cfg.TEST.BBOX_REG:
        # Apply bounding-box regression deltas
        box_deltas = blobs_out['bbox_pred']
        pred_boxes = bbox_transform_inv(boxes, box_deltas)
        pred_boxes = clip_boxes(pred_boxes, im.shape)
    else:
        # Simply repeat the boxes, once for each class
        pred_boxes = np.tile(boxes, (1, scores.shape[1]))

    if cfg.DEDUP_BOXES > 0 and not cfg.TEST.HAS_RPN:
        # Map scores and predictions back to the original set of boxes
        scores = scores[inv_index, :]
        pred_boxes = pred_boxes[inv_index, :]

    return scores, pred_boxes
```
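Reading the two functions together: the exception fires when the set of keyword arguments im_detect passes to net.forward() differs from the net's declared input blobs. A minimal, self-contained sketch of that comparison (the blob names here are illustrative examples, not taken from my prototxt):

```python
# Illustration of the guard in _Net_forward: forward kwargs must exactly
# match the net's declared inputs. (Names below are hypothetical.)
net_inputs = set(['data', 'im_info'])          # e.g. inputs an RPN-style test prototxt declares
forward_kwargs = {'data': None, 'rois': None}  # what im_detect sends when cfg.TEST.HAS_RPN is False

if set(forward_kwargs.keys()) != net_inputs:
    raise Exception('Input blob arguments do not match net inputs.')
```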

What should I do?
I'd appreciate any tips. Thanks!

@cuatristapr

@tsaiJN Maybe your memory issue has to do with the libraries. Check that all the dependencies are working correctly, and that you are using the OpenCV build designed for the Jetson (it has a different implementation for the CUDA framework).

@cuatristapr

@kesonyk Try using COCO; it is a smaller net. Also, it depends on what you want to train for: ImageNet or PASCAL will each be different. You can also create your own image set and train a fast-rcnn model on it; that way the model will be smaller and the Jetson will run it better. NOTE: you are not looking to train on a large dataset; you want something small and concise so that it runs well on the Jetson.

@jiangyuguang

@miyamon11 I met the same error. Have you solved the problem?
The error is the same traceback as above, ending in: Exception: Input blob arguments do not match net inputs.

@Isaamarod

Hi, I have executed ./tools/webcam.py; the cam turns on and selects the objects, but it does not classify them.

Do you know what the problem could be?
