<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="https://www.w3.org/2005/Atom">
<channel>
<title>Objective Funk</title>
<link>https://nsaphra.github.io/</link>
<description>Recent content on Objective Funk</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<copyright>&#169; 2018</copyright>
<lastBuildDate>Wed, 20 Apr 2016 00:00:00 -0700</lastBuildDate>
<atom:link href="/" rel="self" type="application/rss+xml" />
<item>
<title>Sources of Variance in Pretraining and Finetuning</title>
<link>https://nsaphra.github.io/talk/ucirvine/</link>
<pubDate>Mon, 20 Jun 2022 13:00:00 -0700</pubDate>
<guid>https://nsaphra.github.io/talk/ucirvine/</guid>
<description></description>
</item>
<item>
<title>Interpretability Creationism</title>
<link>https://nsaphra.github.io/post/creationism/</link>
<pubDate>Tue, 07 Jun 2022 00:00:00 -0700</pubDate>
<guid>https://nsaphra.github.io/post/creationism/</guid>
<description>
<p>For centuries, Europeans agreed that the presence of a cuckoo egg was a great honor to a nesting bird, as it granted an opportunity to exhibit Christian hospitality. The devout bird enthusiastically fed her holy guest, even more so than she would her own (evicted) chicks <a href="https://app.thestorygraph.com/books/37ed3b62-8a3a-448b-9e37-cd5e5f51c640" target="_blank">(Davies, 2015)</a>. In 1859, Charles Darwin’s studies of another occasional brood parasite, finches, called into question any rosy, cooperative view of bird behavior <a href="https://app.thestorygraph.com/books/44185106-8198-42ef-bacf-8a9bf691e654" target="_blank">(Darwin, 1859)</a>. Without considering the evolution of the cuckoo’s role, it would have been difficult to recognize the nesting bird not as a gracious host to the cuckoo chick, but as an unfortunate dupe. The historical process is essential to understanding its biological consequences; as evolutionary biologist Theodosius Dobzhansky put it, <a href="https://en.wikipedia.org/wiki/Nothing_in_Biology_Makes_Sense_Except_in_the_Light_of_Evolution#cite_note-Dobz_Nothing-1" target="_blank">Nothing in Biology Makes Sense Except in the Light of Evolution</a>.</p>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/5/5c/Reed_warbler_cuckoo.jpg" alt="By Per Harald Olsen - Own work, CC BY-SA 3.0" width="200"/></p>
<p>Certainly SGD is not literally biological evolution, but post-hoc analysis in machine learning <a href="https://twitter.com/ch402/status/1533164918886703104" target="_blank">has a lot in common</a> with scientific approaches in biology, and likewise often requires an understanding of the origin of model behavior. Therefore, the following holds whether looking at parasitic brooding behavior or at the inner representations of a neural network: if we do not consider how a system develops, it is difficult to distinguish a pleasing story from a useful analysis.</p>
<h2 id="just-so-stories">Just-So Stories</h2>
<p>We have many pleasing <a href="https://en.wikipedia.org/wiki/Just_So_Stories" target="_blank">just-so stories</a> in NLP. Much has been made of interpretable artifacts such as <a href="https://aclanthology.org/2022.acl-long.269.pdf" target="_blank">syntactic attention distributions</a> or <a href="https://openai.com/blog/unsupervised-sentiment-neuron/" target="_blank">selective neurons</a>. But how can we know if such a pattern of behavior is actually used by the model?
Causal modeling can help, but interventions that test the influence of particular features and patterns can only explicitly target certain types of behavior. In practice, we may only be able to perform small interventions on specific units within a representation, which fails to properly capture interactions between features. Furthermore, staging these interventions creates distribution shifts that a model may not be robust to, regardless of whether the behavior under study is part of a core strategy. Significant distribution shifts can cause erratic behavior, so why shouldn&rsquo;t they cause spurious interpretable artifacts? In practice, we find <a href="https://arxiv.org/pdf/2010.12016.pdf" target="_blank">no shortage</a> of incidental observations construed as crucial.</p>
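<p>To make that worry concrete, here is a minimal sketch (not from any particular paper) of the kind of single-unit intervention described above, written as a PyTorch forward hook; the model, layer, and unit index are hypothetical placeholders:</p>
<div class="highlight"><pre><code class="language-python">import torch

def ablate_unit(module, unit_idx):
    # Zero out a single unit in this module's output. This is exactly the
    # kind of narrow intervention discussed above: it targets one unit,
    # ignores interactions between features, and pushes activations off
    # the training distribution. Assumes the module outputs a single tensor.
    def hook(mod, inputs, output):
        patched = output.clone()
        patched[..., unit_idx] = 0.0
        return patched
    return module.register_forward_hook(hook)

# Hypothetical usage: ablate one unit in one layer, re-run the evaluation,
# then remove the hook.
# handle = ablate_unit(model.encoder.layer[6], 1337)
# ...evaluate...
# handle.remove()
</code></pre></div>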
<p>Fortunately, the study of evolution has provided a number of ways to interpret the artifacts produced by a model. They might be vestigial, like a human tailbone. They may have dependencies, with some features and structures relying on the presence of other properties earlier in training, like the requirement for light sensing before a complex eye can develop. Some artifacts might represent side effects of training, like how junk DNA constitutes a majority of our genetic code without influencing our phenotypes.</p>
<p>We have a number of theories for how such unused artifacts might emerge while training models. For example, the <a href="https://arxiv.org/abs/1703.00810" target="_blank">Information Bottleneck Hypothesis</a> predicts how inputs may be memorized early in training, before representations are compressed to only retain information about the output. These early memorized interpolations may not ultimately be useful when generalizing to unseen data, but they are essential in order to eventually learn to specifically represent the output. We also can infer the possibility of vestigial features, because early training behavior is so distinct from late training: <a href="https://arxiv.org/abs/1905.11604" target="_blank">earlier models are more simplistic</a>. In the case of language models, they <a href="https://arxiv.org/abs/2109.06096" target="_blank">behave similarly to ngram models</a> early on and <a href="https://www.aclweb.org/anthology/2020.emnlp-main.16" target="_blank">exhibit linguistic patterns</a> later. Side effects of such a heteroskedastic training process could easily be mistaken for crucial components of a trained model.</p>
<h2 id="the-evolutionary-view">The Evolutionary View</h2>
<p>I may be unimpressed by &ldquo;interpretability creationist&rdquo; explanations of static, fully trained models, but I have engaged in similar analysis myself. I&rsquo;ve published papers on <a href="https://arxiv.org/pdf/2010.02180.pdf" target="_blank">probing static representations</a>, and the results often seem intuitive and explanatory. However, the presence of a feature at the end of training is hardly informative about the inductive bias of a model on its own! Consider <a href="https://openreview.net/forum?id=mNtmhaDkAr" target="_blank">Lovering et al.</a>, who found that the ease of extracting a feature at the start of training, combined with an analysis of the finetuning data, tells us more about finetuned performance than simply probing at the end of training does.</p>
<p>Let us consider an explanation usually based on analyzing static models: hierarchical behavior in language models. An example of this approach is the claim that <a href="https://nlp.stanford.edu/pubs/hewitt2019structural.pdf" target="_blank">words that are closely linked on a syntax tree have representations that are closer together</a> than words that are syntactically more distant. How can we know that the model is behaving hierarchically by grouping words according to syntactic proximity? Alternatively, syntactic neighbors may be more strongly linked simply because nearby words are strongly correlated, with higher joint frequencies. For example, perhaps constituents like &ldquo;football match&rdquo; are more predictable because of how often they co-occur, compared to more distant relations like that between &ldquo;uncle&rdquo; and &ldquo;football&rdquo; in the sentence, &ldquo;My uncle drove me to a football match&rdquo;. In fact, we can be more confident that some language models are hierarchical because early models encode more local information, in both <a href="https://arxiv.org/abs/1811.00225" target="_blank">LSTMs</a> and <a href="https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html#argument-phase-change" target="_blank">Transformers</a>, and they learn longer-distance dependencies more easily when those dependencies can be <a href="https://arxiv.org/abs/2010.04650" target="_blank">stacked onto short familiar constituents</a> hierarchically.</p>
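<p>For reference, the structural probe in the paper linked above scores a pair of words by a squared distance under a learned linear map, trained so that these distances match distances in the parse tree. A minimal sketch, with illustrative dimensions:</p>
<div class="highlight"><pre><code class="language-python">import torch

hidden_dim, probe_rank = 768, 64  # illustrative sizes
B = torch.nn.Parameter(torch.randn(probe_rank, hidden_dim) * 0.01)

def probe_distance(h_i, h_j):
    # Squared L2 distance between two word representations after the
    # learned projection B; training fits B so that this approximates
    # the number of edges between the two words in the syntax tree.
    diff = B @ (h_i - h_j)
    return diff.pow(2).sum()
</code></pre></div>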
<h2 id="an-example">An Example</h2>
<p>I recently had to manage the trap of interpretability creationism myself. My coauthors had found that, when training text classifiers repeatedly with different random seeds, <a href="https://arxiv.org/abs/2205.12411" target="_blank">models fall into a number of distinct clusters</a>. Further, we could predict the generalization behavior of a model based on which other models it was connected to on the loss surface. Now, we suspected that different finetuning runs found models with different generalization behavior because their trajectories entered different basins on the loss surface.</p>
<p>But could we actually make this claim? What if one cluster actually corresponded to earlier stages of a model? Eventually those models would leave for the cluster with better generalization, so our only real result would be that some finetuning runs were slower than others. We had to demonstrate that training trajectories could actually become trapped in a basin, providing an explanation for the diversity of generalization behavior in trained models. Indeed, when we looked at several checkpoints, we confirmed that models that were very central to either cluster would become <em>even more</em> strongly connected to the rest of their cluster over the course of training. Instead of offering a just-so story based on a static model, we explored the evolution of observed behavior to confirm our hypothesis.</p>
<p><img src="https://nsaphra.github.io/img/clusters.png" alt="k" /></p>
<h2 id="a-proposal">A Proposal</h2>
<p>To be clear, not every question can be answered by <em>only</em> observing the training process. Causal claims require interventions! In biology, for example, research about antibiotic resistance requires us to deliberately expose bacteria to antibiotics, rather than waiting and hoping to find a natural experiment. Even the claims currently being made based on observations of training dynamics may require experimental confirmation.</p>
<p>Furthermore, not all claims require <em>any</em> observation of the training process. Even to ancient humans, many organs had an obvious purpose: eyes see, hearts pump blood, and <a href="https://www.scientificamerican.com/article/aristotle-thought-the-brain-was-a-radiator/" target="_blank">brains are refrigerators</a>. Likewise in NLP, just by analyzing static models we can make simple claims: that particular neurons activate in the presence of particular properties, or that some types of information remain accessible within a model. However, the training dimension can still clarify the meaning of many observations made in a static model.</p>
<p>My proposal is simple. Are you developing a method of interpretation or analyzing some property of a trained model? Don&rsquo;t just look at the final checkpoint in training. Apply that analysis to several intermediate checkpoints. If you are finetuning a model, check several points both early and late in training. If you are analyzing a large language model, <a href="https://arxiv.org/abs/2106.16163" target="_blank">MultiBERTs</a> and <a href="https://nlp.stanford.edu/mistral/getting_started/download.html" target="_blank">Mistral</a> both provide intermediate checkpoints sampled from throughout training, for masked and autoregressive language models respectively. Does the behavior that you&rsquo;ve analyzed change over the course of training? Does your belief about the model&rsquo;s strategy actually make sense after observing what happens early in training? There&rsquo;s very little overhead to an experiment like this, and you never know what you&rsquo;ll find!</p>
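<p>The overhead really is small. A minimal sketch of the loop, using Hugging Face <code>transformers</code>; the checkpoint identifiers and the analysis function are illustrative placeholders, so check the MultiBERTs release for the exact names:</p>
<div class="highlight"><pre><code class="language-python">from transformers import AutoModel, AutoTokenizer

# Illustrative step names; see the MultiBERTs release for the exact IDs.
steps = ['20k', '100k', '500k', '1000k', '2000k']
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

for step in steps:
    # Load one intermediate checkpoint and rerun the same analysis on it.
    model = AutoModel.from_pretrained(f'google/multiberts-seed_0-step_{step}')
    run_my_analysis(model, tokenizer)  # placeholder for your own probe
</code></pre></div>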
</description>
</item>
<item>
<title>Sources of Variance in Pretraining and Finetuning (Keynote)</title>
<link>https://nsaphra.github.io/talk/quebec/</link>
<pubDate>Wed, 01 Jun 2022 14:30:00 -0700</pubDate>
<guid>https://nsaphra.github.io/talk/quebec/</guid>
<description></description>
</item>
<item>
<title>Linear Connectivity Reveals Generalization Strategies</title>
<link>https://nsaphra.github.io/publication/juneja-linear-2022/</link>
<pubDate>Wed, 01 Jun 2022 00:00:00 +0000</pubDate>
<guid>https://nsaphra.github.io/publication/juneja-linear-2022/</guid>
<description></description>
</item>
<item>
<title>Mathematical Fundamentals of AI</title>
<link>https://nsaphra.github.io/talk/nyu_mlschool/</link>
<pubDate>Sat, 01 Jan 2022 15:00:00 -0800</pubDate>
<guid>https://nsaphra.github.io/talk/nyu_mlschool/</guid>
<description></description>
</item>
<item>
<title>Against Monodomainism</title>
<link>https://nsaphra.github.io/post/monodomainism/</link>
<pubDate>Wed, 28 Apr 2021 00:00:00 -0700</pubDate>
<guid>https://nsaphra.github.io/post/monodomainism/</guid>
<description><p>Reaching the endpoint of a PhD studying how language models learn, I have spent several years telling people that I study &ldquo;machine learning and natural language processing&rdquo;. However, my colleagues who tried to understand or augment image classifiers would describe themselves only as working in &ldquo;machine learning&rdquo;. I argue that this pattern reflects a way of thinking about what counts as &ldquo;application&rdquo; work versus &ldquo;core&rdquo; machine learning, one that damages our understanding of statistical modeling and deep learning as a whole.</p>
<p>Why do we know so little about how language models learn? This gap exists in part because consideration of NLP as a domain has historically been rare in the venues that publish most training dynamics research or analytic work in learning theory. A current search<sup class="footnote-ref" id="fnref:1"><a href="#fn:1">1</a></sup> of ICML 2020 publications returned 169 papers with citations to “Association for Computational Linguistics” or “ACL”, even including citations to many potential sister conferences: NAACL, AACL, or EACL. A search for citations to a single vision conference, “Computer Vision and Pattern Recognition” or “CVPR”, turned up 541 papers. In COLT publications since 2017, the same searches turned up 13 and 23 papers, respectively. In ICML 2020, a search for Wikitext-* or PTB references found only 16 results, while the most popular small corpus for image classification, MNIST, appeared in 264 publications<sup class="footnote-ref" id="fnref:2"><a href="#fn:2">2</a></sup>.</p>
<p>Linguistics provides us with the salient concept of <em>markedness</em> <a href="https://www.degruyter.com/document/doi/10.1515/9783110862010.11/html" target="_blank">(Andersen, 1989)</a>. In language, some forms of a word are the default form, while others are explicitly marked by some additional inflection. An example is the contrast between the word “marked”, which is an <em>unmarked</em> form, and “unmarked”, which is <em>marked</em> by the prefix “un-”. In machine learning, we might call CV an unmarked domain by convention, in contrast to the <em>marked</em> NLP. This convention means that certain tasks and architectures are considered the default environments to understand. Such a convention privileges understanding continuous data over discrete; ConvNets over LSTMs; ResNets over Transformers; geometric tasks over structured prediction.</p>
<p>Understanding one machine learning domain will always extend analysis of others. For example, latent tree structure is inherent to both domains, but in CV, it is obscured by the image data from which we must compose eyes and mouth into a face—and subsequently, body and face into a cow <a href="https://ieeexplore.ieee.org/document/6909858" target="_blank">(Vedaldi et al., 2014)</a>. Image classification is also a language task, because it is our language that provides the intuitions which we use to construct ontologies that turn into image classes; English does not provide us with common distinctions for different packs of wolves, but it names every dog breed, and so the image labels are chosen according to available terminology.</p>
<p>Many researchers think of text data as arcane, but the unmarked domain of CV displays many idiosyncrasies on which to overfit our understanding of statistical modeling. CV provides us with many interesting geometric phenomena, but the underlying structure of language <em>without</em> the added noisy channel of an image can provide a clear and simple domain worth analyzing, as well. A true understanding of statistical models must be a multi-domain understanding, not a mono-domain view focused on one task and its peculiarities.</p>
<div class="footnotes">
<hr />
<ol>
<li id="fn:1">Searches were performed with Google Scholar.
<a class="footnote-return" href="#fnref:1"><sup>^</sup></a></li>
<li id="fn:2">*CL venues have also become distanced from work in computational linguistics <a href="https://www.aclweb.org/anthology/J07-2013.pdf" target="_blank">(Reiter, 2007)</a>, leaving NLP as a field deprived of new scientific work in its data domain as well as new scientific work in its methodologies.
<a class="footnote-return" href="#fnref:2"><sup>^</sup></a></li>
</ol>
</div>
</description>
</item>
<item>
<title>A Non-Linear Structural Probe</title>
<link>https://nsaphra.github.io/publication/white-nonlinear-2021/</link>
<pubDate>Fri, 01 Jan 2021 00:00:00 +0000</pubDate>
<guid>https://nsaphra.github.io/publication/white-nonlinear-2021/</guid>
<description></description>
</item>
<item>
<title>The MultiBERTs: BERT Reproductions for Robustness Analysis</title>
<link>https://nsaphra.github.io/publication/sellam-multiberts-2021/</link>
<pubDate>Fri, 01 Jan 2021 00:00:00 +0000</pubDate>
<guid>https://nsaphra.github.io/publication/sellam-multiberts-2021/</guid>
<description></description>
</item>
<item>
<title>Accessible Means Hackable (Keynote)</title>
<link>https://nsaphra.github.io/talk/pydata/</link>
<pubDate>Sat, 15 Aug 2020 13:00:00 -0700</pubDate>
<guid>https://nsaphra.github.io/talk/pydata/</guid>
<description></description>
</item>
<item>
<title>Understanding Privacy-Related Questions on Stack Overflow</title>
<link>https://nsaphra.github.io/publication/tahaei-understanding-2020/</link>
<pubDate>Wed, 01 Apr 2020 00:00:00 +0000</pubDate>
<guid>https://nsaphra.github.io/publication/tahaei-understanding-2020/</guid>
<description></description>
</item>
<item>
<title>LSTMs Compose (and Learn) Bottom-Up</title>
<link>https://nsaphra.github.io/publication/saphra-lstms-2020/</link>
<pubDate>Wed, 01 Jan 2020 00:00:00 +0000</pubDate>
<guid>https://nsaphra.github.io/publication/saphra-lstms-2020/</guid>
<description></description>
</item>
<item>
<title>Pareto Probing: Trading Off Accuracy for Complexity</title>
<link>https://nsaphra.github.io/publication/pimentel-pareto-2020/</link>
<pubDate>Wed, 01 Jan 2020 00:00:00 +0000</pubDate>
<guid>https://nsaphra.github.io/publication/pimentel-pareto-2020/</guid>
<description></description>
</item>
<item>
<title>What Does a Coder Do If They Can't Type?</title>
<link>https://nsaphra.github.io/post/hands/</link>
<pubDate>Thu, 08 Aug 2019 18:11:42 +0100</pubDate>
<guid>https://nsaphra.github.io/post/hands/</guid>
<description>
<p>In August of 2015, my hands stopped working. I could still control them, but every movement accumulated more pain, so every motion came with a cost: getting dressed in the morning, sending a text, lifting a glass. I was interning at Google that summer, about to begin a PhD in Scotland, but coding all day would have left me in agony. In relating this story, I often mention that for months before I learned to work without my hands, I had nothing to do but go to a bar and order a shot of vodka with a straw in it. This is a very funny joke.</p>
<p>I have been in pain for four years.</p>
<hr />
<h2 id="talon">Talon</h2>
<p>Due to this disability, I cannot type or write by hand. Many people have asked me about the stack that enables me to be productive in spite of this limitation. I hope this information is helpful both for people with more severe limitations, and for programmers with mild repetitive stress injuries who can benefit from reducing their keyboard use.</p>
<p>The star of the show is <a href="https://talonvoice.com/" target="_blank">Talon</a>, a system which makes it easy to write customized grammars and scripts that work with speech recognition systems to enable programming. Commands range from simple aliases for common symbols to complex meta-commands which repeat a previous utterance or change dictation modes. For example, just in the case of parentheses, I have separate commands for <code>(</code>, <code>)</code>, <code>()</code>, and <code>()⬅️</code> (which leaves the cursor between parentheses so my next utterance is bracketed).</p>
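<p>For a sense of what these command definitions look like, here is a minimal sketch in the same older <code>talon.voice</code> API used by the script below; the spoken command words are hypothetical placeholders, not my actual aliases:</p>
<div class="highlight"><pre><code class="language-python">from talon.voice import Context, Key

ctx = Context('parens')
ctx.keymap({
    # Saying a command types the mapped string; a list runs its items in order.
    'leper': '(',
    'repper': ')',
    'round': '()',
    'args': ['()', Key('left')],  # types () and leaves the cursor inside
})
</code></pre></div>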
<p>Each Talon user has a number of personal scripts. The most precious script that I&rsquo;ve written is probably my indexed clipboard:</p>
<div class="highlight"><pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-python" data-lang="python"> <span style="color:#f92672">from</span> talon.voice <span style="color:#f92672">import</span> Key, press, Str, Context
<span style="color:#f92672">from</span> talon <span style="color:#f92672">import</span> clip
<span style="color:#f92672">from</span> .talon_community.utils <span style="color:#f92672">import</span> <span style="color:#f92672">*</span>
ctx <span style="color:#f92672">=</span> Context(<span style="color:#e6db74">&#39;clipboard&#39;</span>)
<span style="color:#66d9ef">def</span> <span style="color:#a6e22e">copy_selection</span>(m):
<span style="color:#66d9ef">with</span> clip<span style="color:#f92672">.</span>capture() <span style="color:#66d9ef">as</span> sel:
press(<span style="color:#e6db74">&#39;cmd-c&#39;</span>)
<span style="color:#66d9ef">if</span> len(m<span style="color:#f92672">.</span>_words) <span style="color:#f92672">&gt;</span> <span style="color:#ae81ff">1</span>:
key <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39; &#39;</span><span style="color:#f92672">.</span>join(parse_words(m))
value <span style="color:#f92672">=</span> sel<span style="color:#f92672">.</span>get()
keymap[<span style="color:#e6db74">&#39;paste </span><span style="color:#e6db74">%s</span><span style="color:#e6db74">&#39;</span> <span style="color:#f92672">%</span> key] <span style="color:#f92672">=</span> value
ctx<span style="color:#f92672">.</span>keymap(keymap)
ctx<span style="color:#f92672">.</span>reload()
<span style="color:#66d9ef">else</span>:
clip<span style="color:#f92672">.</span>set(sel<span style="color:#f92672">.</span>get())
keymap <span style="color:#f92672">=</span> {
<span style="color:#e6db74">&#39;paste&#39;</span>: Key(<span style="color:#e6db74">&#39;cmd-v&#39;</span>),
<span style="color:#e6db74">&#39;clip [&lt;dgndictation&gt;]&#39;</span>: copy_selection,
}
ctx<span style="color:#f92672">.</span>keymap(keymap)</code></pre></div>
<p>The use is simple. After selecting a particular phrase using my cursor control commands, I say &ldquo;clip [foo]&rdquo;, and every time I want to enter the same phrase afterwards, I say &ldquo;paste [foo]&rdquo;. I therefore only have to dictate a particularly obnoxious variable name once. However, it does introduce a new challenge: every variable has two names, its written name and its spoken name. This unfortunate side effect exacerbates the difficulty of naming variables, which has been called &ldquo;the hardest problem in computer science&rdquo;.</p>
<p>If you are a vim or Emacs power user, this may all feel familiar to you. I have commands for searching, moving a cursor, selection, and manipulating the clipboard. Learning to dictate code is a lot like learning a new text editor very thoroughly, down to the challenge of customizing for your particular languages and needs.</p>
<p>The <a href="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/dwiel/talon_community" target="_blank">Talon community</a> has specialized commands that take effect depending on application or programming language. For a Perl user, for example, a good starting point might be to borrow settings from Emily Shea:</p>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
<iframe src="//www.youtube.com/embed/Mz3JeYfBTcY" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" allowfullscreen title="YouTube Video"></iframe>
</div>
<p>My Talon setup relies on Dragon for the speech recognition side. Unfortunately, Nuance has discontinued the OSX Dragon editions that make scripting possible. The coder behind Talon, <a href="https://ryanhileman.com/" target="_blank">Ryan Hileman</a>, is working on a suitable replacement, but at the time of writing, it is not yet ready.</p>
<hr />
<h3 id="interlude">Interlude</h3>
<p>People often ask for my diagnosis, but it officially depends on the country I&rsquo;m in. After an initial assumption that carpal tunnel was to blame, a rheumatologist gave me my first American diagnosis: <em>fibromyalgia</em>, a word which is Doctorspeak for &ldquo;go away&rdquo;.</p>
<p>I did not go away. A neurologist performed a skin biopsy that led to my official American diagnosis of &ldquo;idiopathic small fiber neuropathy&rdquo;, meaning that I am missing crucial nerve fibers that transmit heat and pain but nobody knows why. <em>Idiopathic</em> is also Doctorspeak for &ldquo;go away&rdquo;.</p>
<p>I went away to the UK. I brought my medical records from America, but my British neurologist did not read my records or perform examinations. After a brief conversation, he gave me my British diagnosis by submitting a note that he had no evidence of any physical cause, and he &ldquo;suspected significant functional overlay&rdquo;, which is how they teach you to call someone delusional in medical school.</p>
<p>My GP read the note and informed me: He would not prescribe me painkillers. He would not send me for a second opinion from a neurologist, or treatment from any other specialist. The only referral he would write would be to a psychologist to help me &ldquo;resolve the underlying issues behind my pain&rdquo;.</p>
<p>He then kicked me out of his office for using the word &ldquo;fucking&rdquo;. &ldquo;We do not tolerate cursing&rdquo;, said a sign in the lobby.</p>
<hr />
<h2 id="equipment">Equipment</h2>
<p>For dictating, I use two different microphones. In the office, I use a <a href="https://en-uk.sennheiser.com/me-3-ii" target="_blank">Sennheiser ME-3</a>, while for travel I use a Bluetooth headset, the <a href="https://en-uk.sennheiser.com/mb-pro-1-uc-ml-and-mb-pro-2-uc-ml" target="_blank">Sennheiser MB Pro 2</a>.</p>
<p>Another essential piece of equipment for me is my foot pedal, a <a href="https://www.pageflip.com/products/firefly" target="_blank">PageFlip Firefly</a>. It is programmable, so I have modified the settings to include one that is useful for reading papers in <a href="https://skim-app.sourceforge.io/" target="_blank">Skim</a>, with the left pedal corresponding to a click and the right pedal corresponding to the down arrow. I can use my feet to scroll, and to click for annotations. Another pedal setting I have added maps the pedals to click and shift+enter. This setting is useful for Jupyter notebooks and for writing my research notes and mathematical scratch work in <a href="https://happenapps.com/" target="_blank">Quiver</a>.</p>
<p>When my hands are unusually aggravated, I cannot nudge my mouse around anymore and I fall back on <a href="https://shortcatapp.com/" target="_blank">shortcat</a>, which allows me to press buttons by dictating keyboard strokes instead of using a mouse.</p>
<p>My final essential piece of equipment is a pair of <a href="https://www.futuro-usa.com/3M/en_US/futuro-us/products/~/FUTURO-Night-Wrist-Support/?N=4318+3294508029+3294529207&amp;rt=rud" target="_blank">large wrist braces</a>. The primary purpose of my braces is to discourage me from habitual hand use. I always wear them at conferences, because wearing them is easier than constantly repeating, &ldquo;I cannot shake hands due to a disability&rdquo;.</p>
<hr />
<h3 id="interlude-1">Interlude</h3>
<p>I struggle with sleep. I dream that my thumbs fall off. I dream that every bone in my hands breaks. I dream that my arms break out in open bleeding sores. I wake up and the pain remains like an invisible nightmare.</p>
<hr />
<h2 id="limitations">Limitations</h2>
<p>Perhaps ironically, the largest concern when you begin to dictate code is making sure that you do not develop a repetitive stress injury in your vocal tract. Speaking quietly can actually cause more damage, hydration is important, and better posture will prevent damage to your voice as well as to the rest of your body. I strongly recommend finding a vocal coach who teaches actors and singers how to protect their voices. It is important to take breaks, and you may find talking tiring outside of work.</p>
<p>Speech recognition technology is not perfect, and the error rate is even higher if you have an unusual accent. Furthermore, it may force you to take time off from programming every time you develop a cold or sore throat. I live in fear of even minor colds.</p>
<p>Having a private space to dictate in is essential. I was unable to be productive working from home, but as soon as I had a private office I developed momentum on several research projects. I know that this is a huge limitation for a lot of people because of the productivity-destroying, soul-sucking trend towards open offices for all programming work. If your workplace has fallen prey to this trend, you may still have options. In many countries, large companies will be obligated to provide a space to work in if you are disabled.</p>
<hr />
<h3 id="addendum">Addendum</h3>
<p>Life with my disability is not easy, but thanks to <a href="https://en.wikipedia.org/wiki/Hedonic_treadmill" target="_blank">hedonic adaptation</a> as well as satisfying <a href="https://nsaphra.github.io/publication/" target="_blank">work</a> and <a href="https://auldreekierollerderby.com/2019/08/10/the-one-gift-i-received-along-with-my-disability/" target="_blank">hobbies</a>, I am actually very happy. If you have recently developed a disability or chronic pain condition, it may feel like you could never adjust to the lifestyle required. That is why I have tried to give you a lens into my challenges as well as my successes. It is easy to respond to anyone who has overcome adversity with one of two reactions: &ldquo;It can&rsquo;t be that hard,&rdquo; or &ldquo;I could never do that&rdquo;. Move past both reactions. It is that hard. You can do it.</p>
<p>If you are currently able-bodied, please support your disabled colleagues, coworkers, and anyone you have power over in their quest to do valuable and fulfilling work. I encourage other disabled scientists and programmers to reach out to me with any questions they have.</p>
<hr />
<p><em>Thank you for comments on early drafts: <a href="https://www.cs.jhu.edu/~vandurme/Carrell.html" target="_blank">Annabelle Carrell</a>, <a href="https://www.craiginnes.com/" target="_blank">Craig Innes</a>, <a href="https://twitter.com/uscm_" target="_blank">Matthew Summers</a>, <a href="https://www.dinalevitan.com/" target="_blank">Dina Lev</a>, <a href="https://www.ims.uni-stuttgart.de/institut/mitarbeiter/schlecdk/index.en.html" target="_blank">Dominik Schlechtweg</a>, and <a href="https://americanstudies.yale.edu/people/yuhe-faye-wang" target="_blank">Yuhe Faye Wang</a> (who is in The Humanities!). Thank you to <a href="https://www.recurse.com/" target="_blank">The Recurse Center</a> for providing a private space for me to learn to dictate code. Thank you to my PhD advisor, <a href="https://alopez.github.io/" target="_blank">Adam Lopez</a>, who has unfailingly supported me and made all of this possible.</em></p>
</description>
</item>
<item>
<title>Blackbox NLP Panel Discussion</title>
<link>https://nsaphra.github.io/talk/florence/</link>
<pubDate>Thu, 25 Jul 2019 13:00:00 -0700</pubDate>
<guid>https://nsaphra.github.io/talk/florence/</guid>
<description></description>
</item>
<item>
<title>Get Hooked On Neural Net Inspection! That was a pun!</title>
<link>https://nsaphra.github.io/talk/bangbangwest/</link>
<pubDate>Tue, 28 May 2019 13:00:00 -0700</pubDate>
<guid>https://nsaphra.github.io/talk/bangbangwest/</guid>
<description></description>
</item>
</channel>
</rss>