My first impression of Mexico City was actually not great. Perhaps because I was still steeped in the colors of Guanajuato, the city's night felt a bit too desolate. That brief loneliness, though, was soon dispelled by dinner. If I remember correctly, we waited about half an hour for a table at a popular restaurant (forgive me, the name escapes me), where we ordered sweet pudding and corn porridge doused with hot sauce. The restaurant's TV was showing a South American soccer match; the level of play wasn't high, but everyone seemed to love watching it.

After dinner we strolled along the streets. The walls were plastered with posters for Parasite, which felt refreshingly current compared with the posters for the 2013 film The Great Gatsby we had seen on the streets of Cancún. We also stumbled upon some Joker graffiti, and my expectations for this city instantly rose.

I remember covering quite a few streets while swapping stories about cycling in Shanghai, which sparked the idea of biking around Mexico City; unfortunately, a certain someone wasn't yet steady enough on a bike, so we dropped it. On the way back to our lodging, the car radio played Dire Straits' "Sultans of Swing," and my mood brightened considerably.

In Mexico City we mainly visited the Zócalo (Constitution Square) and the National Museum of Anthropology. We had wanted to see the sci-fi-looking library, but it was closed during that period, so we walked to the anthropology museum instead, taking in Mexico City's residential quarters and the graffiti portraits of women artists on the walls along the way. The museum was a revelation: the ruins and archaeological finds mentioned in the anthropology course I took junior year were all there, as actual artifacts. I played docent for a while and felt rather pleased with myself.

At the Zócalo we first went up the tallest building in Latin America. It is only two hundred-some meters high, but for historical reasons Mexico City has so many churches that high-rises could never be planned around them, so from the observation deck the whole city spreads out below you. The other three main buildings around the square are the cathedral, the city hall, and the post office. The post office was so resplendent with gold that we thought we had wandered into the city hall by mistake.

Mexico City was a great eating experience. There were coconuts to drink by the roadside and fresh-juice shops where a liter of papaya juice went for under two dollars. The street taco stands piled on the meat, and troubadours performed while we ate. The owner spoke no English but enthusiastically explained the performance to us; we got by entirely on Google Translate, which amounted to a stress test for the app. He also pointed us to places to buy souvenirs and silver jewelry. We ended up spending all our cash on silver and had none left for postcards, so back in New York we picked up a few in Times Square to mail to friends.

Our last night there was New Year's Eve. Some unrest seemed to break out at the Zócalo, so we hurried back to our lodging and counted down to the new year there. Before returning to the States for the new semester, we talked about what was on our minds.

Going to Guanajuato, a small town more than five hours' drive from Mexico City, was my bit of stubbornness. I was smitten the first time I saw a photo of it.

But the story should start from the moment we got off the plane in Mexico City. After collecting our luggage, we were going to take a taxi to the bus station to catch the afternoon coach to Guanajuato. Probably because we had gotten up far too early for the flight and dozed badly on the plane, once we settled into the Uber, Chen sighed, "If only we could just ride this car straight there."

What happened next was completely beyond my comprehension: Chen tentatively asked the driver whether he would be willing to take us to Guanajuato. The driver gave it serious thought, reported the situation to the lady of the house, and actually agreed. He even moonlighted as a tour guide, taking us for a meal at a fast-food chain beloved by locals. And so began this marvelous leg of the journey: Guanajuato.

Admiring the scenery along the way, we reached our destination before dinner. Our lodging sat halfway up the hill, perhaps a little higher; climbing onto the rooftop terrace gave us a bird's-eye view of most of the town. The sun hadn't set and the lights weren't on yet, but the little lanes and colorful brick-and-stone houses were exactly as I had imagined, and I felt utterly content.

In my eyes, everything in Guanajuato seemed to be waiting for night to fall. At dusk the town was serene, with hardly anyone on the roads, and it reminded me of lighting the stove to cook dinner back in my hometown years ago. On the way down the hill, an old man greeted us in French, and a few children wandered past with a little dog for company.

When night fell and the lights came on, Guanajuato grew lively. Down in the town center in the valley, locals and tourists were parading in celebration of the Day of the Dead, musicians carrying traditional instruments and a cello leading the whole procession in song. We followed the parade to the lookout on the other side of the valley.

Seen from the lookout, Guanajuato was gorgeous and romantic, in no way inferior to its scenes in the movies. The lights and colors kept us rooted there for a long while, and the five bumpy hours felt entirely worth it. Words can hardly convey the feeling; it seemed Guanajuato would only be complete with a love story.

On the way back to our lodging, we happened upon a bakery that hadn't yet closed, and the aroma drew us in. The furnishings were simple, a little family workshop: an oven in the back room, shelves in the front. Some of the bread was fresh from the oven, and it was delicious.

We stayed in Guanajuato for only one night. The next morning we caught the local market: a big shed selling all kinds of food and household goods, with little Coca-Cola signboards here and there, like a market fair in a small Chinese town a decade ago. At noon we took a double-decker coach back to Mexico City, wistful but without regret.

I can’t distinguish between kisses and putting down roots
I can’t distinguish between the complex and the simple
and you are on my list of promises to forget
everything burns if you use the right spark

My first sensation stepping off the plane was warmth. Compared with the weather in New York and Ann Arbor, Cancún was blissfully comfortable, and my spirits lifted at once. We took a minibus from the airport to our lodging, running along the coast through the hotel zone just at sunset, and the evening glow was spectacular. After all the snow in Ann Arbor, I was more than ready for the embrace of beach and sea.

As tourists, our first order of business was exchanging money, and yet for our whole week in Mexico the dollar-peso rate remained a mystery. On the first night we tried three OXXO convenience stores, and all three quoted different rates; we even watched one owner change his rate on the spot, just enough to undercut the competition by a sliver.

The first dinner was a delight. The waiter had accidentally skipped over our reservation, and by way of apology he seated us in the band's performance area, which is how we came to experience a song-request service. It was awkward at first: three outsiders who knew nothing of Mexican music, warmly surrounded by a band and momentarily at a loss, until Qipeng remembered a song. When the music started, everyone relaxed and slowly melted into the atmosphere. Someday when I have time I plan to edit the video of that performance and share it; it has real charm.

On Chen's recommendation we ordered a number of regional dishes. I've entirely forgotten their names; I only remember that they were delicious, along with a wonderfully refreshing tall glass of lemonade. It happened to be Christmas Eve, the waiters all wore Santa hats, and strings of lights and bunting hung everywhere. It felt lovely.

Although AT&T claims to offer service in Mexico, the actual network speed left us speechless. So as not to derail the next day's plans, we bought a SIM card at a small roadside supermarket for very little money. We picked up some breakfast while we were at it; the store carried basically everything you could think of, including plenty of local liquor with sea turtles printed on the bottles, and the tropical flavor was immediately apparent.

The next day we joined a tour to Chichén Itzá, site of the Mayan pyramid. On the way we watched a film about Mayan culture, its gory scenes uncut, which brought some of the Mayan stories to life more vividly than any book and stripped away a bit of the mystery. At the ruins, seeing it with my own eyes did not disappoint; it was genuinely staggering. The weather was clear, the details of the carvings on the walls and the pyramid stood out sharply, and with eagles wheeling overhead, a tragic, heroic grandeur hung over the place.

Cancún for us was mostly about the water: we tried stand-up paddleboarding, snorkeling, and the Xplor adventure park. Paddleboarding first: the kind where you stand on a board and paddle. At the start, the fear of falling in actually made us wobblier; once we got used to balancing, standing up came quickly. We followed our French instructor along the waterway, gradually picking up speed as we grew comfortable, and began to anticipate who would fall in first. Chen did not let us down, and the GoPro happened to catch the moment he hit the water.

Snorkeling was the activity I had most looked forward to. We first practiced in a small pool, getting used to the breathing apparatus and the weight of the air tank. The water wasn't very clear, and a sense of pressure made it psychologically hard to breathe. The instructor taught us a few key hand signals and commands for communicating underwater. We took a small boat to the dive site and descended along a rope to the seabed; you could truly feel the force of the current, and a beginner who lost hold of the rope could easily be swept away. My mask didn't seal well, so I had to surface and swap with the instructor. Underwater we were lucky enough to see a sea turtle and even gave chase for a bit, and I let the little fish in the coral nibble at my fingers: a singular experience.

Xplor was where we spent our last day in Cancún. We had planned to take the park shuttle, but after two exhausting days all three of us overslept, so we ended up calling an Uber. It proved a blessing in disguise: we beat the shuttle to the park and were spared the lines. The best part was the ziplines; I remember riding all six or seven of them twice. Then there was the off-road driving, and driving without a license was quite a thrill. On the ride back, Chen held a lesson on Spanish numerals, with remarkable results: by the next day we could still just about recall them all.

On our last night in Cancún we visited the local night market. There was a good variety of snacks, but the real surprises were the street performers and the Spanish-language rap. One performer in full black rock regalia, speaker in tow, sang what sounded like a tender ballad, slightly out of step with the surroundings. Later, on a whim, I looked it up: it was "La chispa adecuada" by Héroes del Silencio.

The two rappers looked about our age, full of energy as they circled the tables, walking and rhyming, occasionally trying to draw in passersby. Seeing me filming, one even greeted the camera. Something new was striking its own chord with this city of deep history and distinctive customs.

Autonomous Robot for Target Detection and Pursuit

Overview

AuTom and Jerry is an autonomous robotic system inspired by the classic “Push-Button Kitty” episode of Tom and Jerry. The project demonstrates an intelligent mobile robot capable of detecting, tracking, and approaching moving targets in unknown environments while avoiding obstacles.

Key Capabilities:

  • Real-time target detection and tracking
  • Autonomous navigation with obstacle avoidance
  • Dynamic path planning in unknown environments
  • Robust state management for various operational scenarios

System Architecture

Hardware Components

The system is built on a standard MBot platform with the following components:

Core Components:

  • Raspberry Pi (main processing unit)
  • BeagleBone (auxiliary processing)
  • LiDAR sensor (environmental mapping)
  • Two-wheeled differential drive system
  • On-board camera system

Enhanced Components:

  • Two additional 720p cameras with 100° field of view
  • Replaced the standard PiCam to increase detection range and coverage

Software Framework

The system integrates several key technologies:

Computer Vision:

  • AprilTag detection for target identification
  • Real-time pose estimation
  • Multi-camera sensor fusion

Navigation & Planning:

  • Simultaneous Localization and Mapping (SLAM)
  • A* pathfinding algorithm
  • PID control for precise motion

State Management:

  • Finite State Machine (FSM) for behavioral control
  • Robust handling of target loss and recovery

Technical Implementation

Target Detection and Tracking

The vision system uses AprilTags as fiducial markers for reliable target identification. The multi-camera setup provides:

  • Wide-angle coverage: 100° field of view per camera
  • Real-time pose calculation: 6-DOF target positioning
  • Coordinate transformation: Camera frame to SLAM coordinate frame
  • Robust tracking: Handles partial occlusion and varying lighting
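To make the pipeline concrete, here is a minimal sketch of the detection-to-SLAM handoff, assuming the pupil_apriltags Python bindings; the intrinsics, tag size, and camera mounting convention below are illustrative placeholders, not the project's calibrated values.

import numpy as np
from pupil_apriltags import Detector

# Illustrative values; the project's real intrinsics and tag size differ.
CAMERA_PARAMS = (600.0, 600.0, 640.0, 360.0)  # fx, fy, cx, cy
TAG_SIZE = 0.10  # tag edge length in meters

detector = Detector(families="tag36h11")

def detect_target(gray):
    """Return (lateral, forward) tag position in the camera frame,
    or None when no tag is visible."""
    dets = detector.detect(gray, estimate_tag_pose=True,
                           camera_params=CAMERA_PARAMS, tag_size=TAG_SIZE)
    if not dets:
        return None
    t = dets[0].pose_t  # 3x1 translation of the tag in the camera frame
    return t[0, 0], t[2, 0]

def camera_to_slam(x_cam, z_cam, robot_pose):
    """Rotate a camera-frame detection into the SLAM (world) frame,
    assuming the camera looks along the robot's +x axis."""
    rx, ry, rtheta = robot_pose   # robot pose from the SLAM localizer
    forward, left = z_cam, -x_cam # camera x points right
    wx = rx + forward * np.cos(rtheta) - left * np.sin(rtheta)
    wy = ry + forward * np.sin(rtheta) + left * np.cos(rtheta)
    return wx, wy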

Motion Planning and Control

Path Planning:

  • Utilizes the A* algorithm for optimal path generation
  • Integrates SLAM-generated occupancy grid for obstacle awareness
  • Considers dynamic target movement in planning decisions
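As a reference point for the planner, here is a compact grid A* of the kind described, run over a SLAM-style occupancy grid; the 0/1 grid encoding, unit step cost, and 4-connected neighborhood are assumptions for illustration.

import heapq
import itertools

def astar(grid, start, goal):
    """A* over a 2D occupancy grid (0 = free, 1 = occupied).
    start and goal are (row, col) cells; returns a path or None."""
    def h(a, b):
        # Manhattan distance: admissible for 4-connected motion
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    tie = itertools.count()  # tie-breaker so the heap never compares cells
    frontier = [(h(start, goal), 0, next(tie), start, None)]
    came_from = {}
    g_best = {start: 0}
    while frontier:
        _, g, _, cur, parent = heapq.heappop(frontier)
        if cur in came_from:
            continue  # already expanded at a lower cost
        came_from[cur] = parent
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if (0 <= nr < len(grid) and 0 <= nc < len(grid[0])
                    and grid[nr][nc] == 0):
                ng = g + 1
                if ng < g_best.get(nxt, float("inf")):
                    g_best[nxt] = ng
                    heapq.heappush(frontier,
                                   (ng + h(nxt, goal), ng, next(tie), nxt, cur))
    return None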

Control System:

  • Fine-tuned PID controller for smooth motion execution
  • Differential drive control for precise maneuvering
  • Real-time velocity and heading adjustments
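The control law itself is a textbook PID loop; a minimal sketch follows, with placeholder gains rather than the tuned values used on the robot.

class PID:
    """Simple PID controller for one channel (e.g., heading error)."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        derivative = (0.0 if self.prev_error is None
                      else (error - self.prev_error) / dt)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Differential drive: a forward speed plus a heading correction
# split across the two wheels (gains are illustrative).
heading_pid = PID(kp=1.2, ki=0.0, kd=0.1)

def wheel_speeds(forward, heading_error, dt):
    turn = heading_pid.update(heading_error, dt)
    return forward - turn, forward + turn  # left, right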

Behavioral State Machine

The FSM manages robot behavior across different operational scenarios:

Primary States:

  1. Search State

    • Systematic environment scanning
    • Target detection within camera field of view
    • Transition to Follow state upon target acquisition
  2. Follow State

    • Continuous target tracking
    • Dynamic path replanning
    • Collision avoidance integration
  3. Recovery State

    • Activated when target leaves field of view
    • Predictive turning based on target’s last known trajectory
    • View expansion maneuvers to reacquire target

State Transitions:

  • Target detected: Search → Follow
  • Target lost: Follow → Recovery
  • Target reacquired: Recovery → Follow
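A skeletal version of this FSM, using the three states and the transitions listed above; the action strings stand in for calls into the project's actual vision and planning modules.

from enum import Enum, auto

class State(Enum):
    SEARCH = auto()
    FOLLOW = auto()
    RECOVERY = auto()

def step(state, target_visible):
    """One FSM tick: return (next_state, action). The action names
    are placeholders for the real motion-planner calls."""
    if state is State.SEARCH:
        if target_visible:
            return State.FOLLOW, "plan_path_to_target"
        return State.SEARCH, "rotate_scan"
    if state is State.FOLLOW:
        if not target_visible:
            return State.RECOVERY, "turn_toward_last_bearing"
        return State.FOLLOW, "replan_and_track"
    # RECOVERY: turn toward the target's last known trajectory
    if target_visible:
        return State.FOLLOW, "plan_path_to_target"
    return State.RECOVERY, "expand_search_arc"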

Results and Performance

The system successfully demonstrates autonomous target pursuit with the following characteristics:

  • Reliable detection: Consistent AprilTag recognition across varying conditions
  • Smooth navigation: Collision-free movement in cluttered environments
  • Adaptive behavior: Robust recovery from target loss scenarios
  • Real-time performance: Low-latency response to dynamic target movement

Technical Challenges and Solutions

Challenge 1: Limited Field of View

  • Solution: Multi-camera setup with wide-angle lenses

Challenge 2: Real-time Path Planning

  • Solution: Efficient A* implementation with SLAM integration

Challenge 3: Target Loss Recovery

  • Solution: Predictive FSM with intelligent search patterns

Future Enhancements

Potential improvements for the system include:

  • Advanced Prediction: Machine learning for target trajectory prediction
  • Multi-target Tracking: Simultaneous pursuit of multiple targets
  • Enhanced Sensors: Integration of additional sensor modalities
  • Collaborative Robotics: Multi-robot coordination capabilities

Demo

This project demonstrates the integration of computer vision, autonomous navigation, and intelligent control systems in a practical robotic application.

Implementation of Algorithms Designed by Yiding Ji and Stéphane Lafortune

Abstract

The paper investigates quantitative supervisory control with a local mean-payoff objective for weighted discrete event systems. The supervisor is designed to ensure that the mean payoff of the weights over a fixed number of transitions never drops below a given threshold, a property that captures system stability. The algorithms transform the supervisory control problem into a two-player game between the supervisor and the environment.

Automaton Model

The weighted discrete event system can be modeled as an automaton; an example is shown below. States marked with "!" are unsafe. The set of controllable events is {a, b, c, d, e, f}, and the set of uncontrollable events is {u1, u2, u3, u4, u5}. The localhost web GUI is provided by the WebMachine class from the transitions_gui package.

import time

from transitions_gui import WebMachine


def visualizeMachine(states, transitions, initial, name, ordered_transitions=False,
                     ignore_invalid_triggers=True, auto_transitions=False):
    # Serve the automaton as an interactive graph on a localhost web GUI.
    machine = WebMachine(states=states, transitions=transitions, initial=initial,
                         name=name, ordered_transitions=ordered_transitions,
                         ignore_invalid_triggers=ignore_invalid_triggers,
                         auto_transitions=auto_transitions)
    try:
        # Keep the server alive until the user interrupts with Ctrl-C.
        while True:
            time.sleep(5)
    except KeyboardInterrupt:
        machine.stop_server()
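A hypothetical invocation with a toy three-state automaton, written in the [trigger, source, destination] list format that transitions accepts; the state and event names here are illustrative, not the example automaton from the paper.

states = ['x0', 'x1', '!x2']           # '!' marks the unsafe state
transitions = [['a', 'x0', 'x1'],      # controllable event
               ['u1', 'x1', '!x2'],    # uncontrollable event
               ['b', 'x1', 'x0']]
visualizeMachine(states, transitions, initial='x0', name='G')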

Algorithm 1

The first algorithm transforms the automaton model into a two-player game. It has two steps: insert transition states into the original automaton, then remove deadlocked or unsafe states. A DFS enumerates all transitions; in general, the transition states inserted for an original state correspond to every combination of the controllable events enabled there, each joined with all of the uncontrollable ones.

from itertools import combinations


def DoDFS(y, X, x0, f, Ec, Euc, w, Qy, Qz, G, fyz, fzy):
    # Collect the controllable/uncontrollable events enabled at state y.
    gamma = []
    tEuc = []
    tEc = []
    for t in f:
        if t[1] == y:
            if t[0] in Euc:
                tEuc.append(t[0])
            else:
                tEc.append(t[0])
    if len(tEuc) == 0 and len(tEc) == 0:
        return
    if len(tEuc) != 0 and len(tEc) == 0:
        gamma.append(tEuc)
        G.append(tEuc)
    else:
        if len(tEuc) != 0:
            gamma.append(tEuc)
            G.append(tEuc)
        # Each combination of controllable events, joined with all the
        # uncontrollable ones, yields one candidate control decision.
        for i in range(1, len(tEc) + 1):
            for r in combinations(tEc, i):
                gamma.append(list(r) + tEuc)
                G.append(list(r) + tEuc)
    for g in gamma:
        # Insert a transition (Z) state for each control decision g at y.
        z = [y] + g
        fyz.append([g, y, z])
        if z not in Qz:
            Qz.append(z)
        for e in g:
            yn = ''
            for t in f:
                if t[0] == e and t[1] == y:
                    yn = t[2]
                    break
            if yn[0] == '!':
                continue  # drop transitions into unsafe states
            fzy.append([e, z, yn])
            if yn not in Qy:
                Qy.append(yn)
                DoDFS(yn, X, x0, f, Ec, Euc, w, Qy, Qz, G, fyz, fzy)


def removeUnsafe(states, transitions):
    # Iterate over copies: both lists are mutated during removal.
    for s in list(states):
        if s[0] == '!':
            for t in list(transitions):
                if t[1] == s or t[2] == s:
                    transitions.remove(t)
            states.remove(s)
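A hedged driver sketch for Algorithm 1, reusing the toy automaton above; the weight map w, the initial state 'x0', and the way removeUnsafe is applied are illustrative assumptions.

Ec = ['a', 'b', 'c', 'd', 'e', 'f']    # controllable events
Euc = ['u1', 'u2', 'u3', 'u4', 'u5']   # uncontrollable events
w = {'a': 1, 'b': 1, 'u1': -2}         # example event weights
Qy, Qz, G, fyz, fzy = ['x0'], [], [], [], []
DoDFS('x0', states, 'x0', transitions, Ec, Euc, w, Qy, Qz, G, fyz, fzy)
removeUnsafe(Qy, fzy)                  # step 2: prune remaining unsafe states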

Algorithm 2

The second algorithm finds the supervisor's winning region. It first computes the window mean-payoff function, which tracks the best worst-case accumulated weight the supervisor can achieve from a state within N event occurrences (the window size). The goal is then equivalent to the supervisor adopting a control strategy that keeps the accumulated payoff nonnegative, while the environment aims to spoil that goal by enforcing negative payoffs. Divide-and-conquer finds all stable windows, i.e., those in which the local window mean payoffs are all nonnegative. The example uses a window size of three.

def StableWindow(Qy, Qz, fyz, fzy, w, N):
    # h[i][q] is the best worst-case accumulated weight achievable
    # from state q within i event occurrences.
    wgr = []
    h = {0: {}}
    for q in Qy + Qz:
        h[0][str(q)] = 0
    for i in range(1, N + 1):
        h[i] = {}
        for q in Qz:
            # Environment (Z) states: the adversary picks the worst event.
            ey = [[element[0], element[2]]
                  for element in fzy if element[1] == q and element[2] in Qy]
            if len(ey) != 0:
                ystates = [y for y in ey if str(y[1]) in h[i - 1]]
                if len(ystates) != 0:
                    h[i][str(q)] = min(w[t[0]] + h[i - 1][str(t[1])]
                                       for t in ystates)
                    if h[i][str(q)] >= 0:
                        wgr.append(q)
        for q in Qy:
            # Supervisor (Y) states: the supervisor picks the best decision.
            ytz = [element[2]
                   for element in fyz if element[1] == q and element[2] in Qz]
            if len(ytz) != 0:
                zstates = [z for z in ytz if str(z) in h[i]]
                if len(zstates) != 0:
                    h[i][str(q)] = max(h[i][str(z)] for z in zstates)
                    if h[i][str(q)] >= 0:
                        wgr.append(q)
    # Deduplicate while preserving order.
    wg = []
    for q in wgr:
        if q not in wg:
            wg.append(q)
    return wg


def WinLocal(Qy, Qz, fyz, fzy, w, N):
    # Shrink the game to the stable states until a fixed point is reached.
    wg = StableWindow(Qy, Qz, fyz, fzy, w, N)
    if len(wg) == len(Qy + Qz) or len(wg) == 0:
        return wg
    Qyn = [y for y in Qy if y in wg]
    Qzn = [z for z in Qz if z in wg]
    return WinLocal(Qyn, Qzn, fyz, fzy, w, N)


def attraction(attr, s0, Q, f, wp):
    # s0 is attracted to the winning set if every path from it leads there.
    if s0 in wp:
        return True
    res = [attraction(attr, i[2], Q, f, wp) for i in f if i[1] == s0]
    if False in res:
        return False
    attr.append(s0)
    return True


def winRegion(Qy, Qz, fyz, fzy, w, N, y0, wl):
    ws = []
    Qyw = list(Qy)
    Qzw = list(Qz)
    while len(ws) != len(Qy + Qz):
        wp = WinLocal(Qyw, Qzw, fyz, fzy, w, N)
        wl += wp
        if len(wp) == 0:
            break
        attrr = []
        attraction(attrr, y0, Qy + Qz, fzy + fyz, wp)
        for p in wp + attrr:
            if p not in ws:
                ws.append(p)
        # Iterate over copies: the working sets are mutated in place.
        for y in list(Qyw):
            if y in ws:
                Qyw.remove(y)
        for z in list(Qzw):
            if z in ws:
                Qzw.remove(z)
    return ws
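A hedged call into Algorithm 2 on the game built above, with the example's window size of three; 'x0' again stands in for the initial Y-state.

wl = []    # accumulates the locally winning states per iteration
ws = winRegion(Qy, Qz, fyz, fzy, w, 3, 'x0', wl)
print(ws)  # the supervisor's winning region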

Algorithm 3

The third algorithm reconstructs and unfolds the result of the second. It has two steps: first, merge all the stable windows from the second algorithm and add the original states back into the automaton so that each original state has exactly one outgoing arrow; second, remove all the inserted environment states to simplify the automaton, finally yielding a stable workflow.

def Unfold(y0u, Qyu, Qzu, fyzu, fzyu, Qy, Qz, fyz, fzy):
    # duplicate[y] counts the control decisions at y; lru[y] cycles
    # through them so each unfolded copy of y keeps one out-arrow.
    duplicate = {}
    lru = {}
    for i in Qy:
        duplicate[i] = 1
        lru[i] = 0
    state = y0u
    while len(Qzu) < len(Qz):
        du = False
        sn = ''
        if state in Qyu:
            if duplicate[state] != 1:
                du = True
                sn = '{' + state + '}' + str(lru[state])
                if lru[state] != 0:
                    Qyu.append(sn)
        else:
            Qyu.append(state)
            d = 0
            for i in fyz:
                if i[1] == state:
                    d += 1
            duplicate[state] = d
        lru[state] = (lru[state] + 1) % duplicate[state]
        # Pick the next unvisited Z successor of the current state.
        temp = []
        for i in fyz:
            if i[1] == state and i[2] not in Qzu:
                temp.append(i)
        z = ''
        if len(temp) == 1:
            z = temp[0][2]
            Qzu.append(z)
            if du == False:
                fyzu.append(temp[0])
            else:
                fyzu.append([temp[0][0], sn, z])
        else:
            # Otherwise prefer the decision with the largest control pattern.
            t = [len(i[2]) for i in temp]
            yz = temp[t.index(max(t))]
            z = yz[2]
            Qzu.append(z)
            if du == False:
                fyzu.append(yz)
            else:
                fyzu.append([yz[0], sn, z])
        y = []
        for i in fzy:
            if i[1] == z:
                if lru[i[2]] == 0:
                    fzyu.append(i)
                else:
                    fzyu.append([i[0], i[1], '{' + i[2] + '}' + str(lru[i[2]])])
                y.append(i[2])
        if len(y) == 1:
            state = y[0]
        else:
            # Continue from the least-visited successor, favoring states
            # that have not been unfolded yet.
            yin = []
            for i in y:
                if i not in Qyu:
                    yin.append(i)
            num = []
            if len(yin) == 0:
                for i in y:
                    count = 0
                    for j in fzy:
                        if j[2] == i:
                            count += 1
                    num.append(count)
                state = y[num.index(min(num))]
            else:
                for i in yin:
                    count = 0
                    for j in fzy:
                        if j[2] == i:
                            count += 1
                    num.append(count)
                # Bug fix: index into yin, whose counts were just computed.
                state = yin[num.index(min(num))]
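Finally, a hypothetical call for Algorithm 3, assuming Qy, Qz, fyz, and fzy have already been restricted to the winning region from Algorithm 2; 'x0' remains the illustrative initial Y-state.

Qyu, Qzu, fyzu, fzyu = [], [], [], []
Unfold('x0', Qyu, Qzu, fyzu, fzyu, Qy, Qz, fyz, fzy)
# Qyu/Qzu and fyzu/fzyu now describe the stable supervisor workflow,
# which can be rendered with visualizeMachine as before.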
