The art of reporting incidents

During my PhD, I have been responsible for several pieces of lab infrastructure (data storage, microscopes). To learn how to be a responsible service provider, I watched many videos (Tom Limoncelli, Alice Goldfuss) and even purchased a couple of books (the best is The Practice of System and Network Administration).

What I’ve learned is that network/system/computer administrators have figured out a pretty good way to manage infrastructure, manage expectations, and make sure systems with users run smoothly, while minimizing pain for the people in charge. We can borrow a lot of this knowledge and reuse it in the setting of a research lab.

One aspect of managing systems is responding to incidents. An incident is basically when bad or unexpected stuff happens, no matter the reason. For example, the university network stops working. Or there is an electrical outage. Or I make a configuration change that blocks all users from accessing their data.

All of these events have one thing in common: something is affecting the service beyond the original expectations of the user. The first step in remediation, often, must be clear and honest communication of the situation to users, or any interested party (think students who use the system, but also the PI who runs the lab). We often see that this is not done clearly enough, or at the right time, or using the right tools.

A bad way to report an incident: a piece of paper on the outside of the building; no official emails; no alarms raised inside

The basis of my thinking about this was stolen shamelessly from Tom Limoncelli (for example, Radical Ideas Enterprises Can Learn From The Cloud).

The way we inform people of any issue should follow this minimal checklist:

  • Inform in a timely fashion, ideally as soon as the issue is discovered and an initial assessment is done. How long to wait depends on the relative risk of the condition. If we suspect a gas leak, we should not wait, and should inform all parties immediately. If a fridge seems to be broken, we should first check whether it is plugged in before reporting.
  • Be clear about the incident area. “There is an issue with the system” is not as clear as “The network connectivity has been down since 10am”. The purpose of communicating as much as possible is to reassure everyone that you are on top of things and transparent about what’s going on. It also removes unnecessary worries: “Data is inaccessible due to a network issue” makes it clear that the data is still intact.
  • As you describe what has happened, make sure to include things that didn’t happen (to the best of your knowledge). The network is down, but data is safe. The power in room 123 is off, but emergency power in room 123 is still running, so the microscope is still working. The fridge seems to be broken, but the temperature sensor still says -20C.
  • Be clear about what has been done so far to investigate and remedy the issue. “Something happened with the lights in room 321, we called facilities” conveys that you are on top of things.
  • Make sure to be clear that you will update people on the issue. It might not be your job to fix the issue, but it is your job to communicate. There is no electricity? Cool, provide a contact for the person in charge or be the point of contact. It is OK to delegate or give away responsibility. “Contacted facilities, please refer all questions to John Doe, as there is nothing I can do” means you are managing people’s expectations and providing transparency once again. Ideally, provide a time when you will update (“Will report back by 4pm with updates”).
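The checklist above can be turned into a small template that refuses to produce a message until every item is filled in. This is only a rough sketch in Python: the class name IncidentReport, its field names, and the output labels are made up for illustration, and timeliness is left to the person sending the message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IncidentReport:
    """One first-notice message; each field maps to a checklist item."""
    what_happened: str        # specific incident area, ideally with a start time
    not_affected: List[str]   # things that did NOT happen, as far as we know
    actions_taken: List[str]  # what has been done so far
    contact: str              # point of contact (you, or whoever you delegated to)
    next_update: str          # when the next update will be sent

    def missing_items(self) -> List[str]:
        """Return the names of checklist items that are still empty."""
        return [name for name, value in vars(self).items() if not value]

    def render(self) -> str:
        """Build the plain-text message; raise if the checklist is incomplete."""
        missing = self.missing_items()
        if missing:
            raise ValueError("Incomplete report, fill in: " + ", ".join(missing))
        lines = [f"WHAT HAPPENED: {self.what_happened}"]
        lines += [f"NOT AFFECTED: {item}" for item in self.not_affected]
        lines += [f"DONE SO FAR: {item}" for item in self.actions_taken]
        lines.append(f"CONTACT: {self.contact}")
        lines.append(f"NEXT UPDATE: {self.next_update}")
        return "\n".join(lines)


# Example built from the phrases used in the checklist.
report = IncidentReport(
    what_happened="Network connectivity has been down since 10am",
    not_affected=["Data is intact, only inaccessible"],
    actions_taken=["Contacted facilities"],
    contact="John Doe",
    next_update="Will report back by 4pm with updates",
)
print(report.render())
```

The point of raising an error on an empty field is the same as the point of the checklist: the first message should already answer “has X been affected” and “who should I contact about this”.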

Making sure you check all these boxes in your very first email / report about the incident will allow people to make decisions about their work; it will provide confidence that the incident is being dealt with professionally; and it will save you time by avoiding questions like “has X been affected” and “who should I contact about this”.

How to write top-down

Just had a conversation with a fellow PhD student about writing an experimental design, or Methods section. The top-down principle suggests starting with very broad strokes and going down by refining each item. In our case it looks something like this.

Draft 1:

  • Mount fish
  • Image fish
  • Present stimuli
  • Process data
  • Analyze data

Let’s assume we’ve decided that is enough writing for the day. But tomorrow we can come back and have energy to work on some of these items, for example numbers (2) and (3). Our Methods section now turns into:

Draft 2:

  • Mount fish
  • Image fish
    • Prepare microscope: select filters, laser lines
    • Set up imaging parameters: Z resolution, XY resolution, temporal resolution, scanned Z extent, temporal extent
  • Present stimuli
    • Add projector to the microscope
    • Program projector to present images to the fish
    • Set up parameters: intensity, time delay, randomization seed etc
    • Present visual stimuli A, B, C
    • Control for intensity
    • Control for spatial location of stimuli
  • Process data
  • Analyze data

This allows us to work on a part of the bigger work, instead of trying to write the whole section at once. It also allows making small notes on what should be there, so it is easier to remember and fill in the details later.

Draft 3:

  • Mount fish
  • Image fish
    • Prepare microscope: select filters, laser lines
    • Set up imaging parameters: Z resolution, XY resolution, temporal resolution, scanned Z extent, temporal extent
      • Z resolution: 5um for single-cell resolution, 0.5um for sub-cellular imaging [reference; reference]
      • Camera pixel size is 6.25um. The 20X Olympus objective is specified for a 180mm tube lens, so with a 110mm tube lens the effective magnification is 20X × 110mm / 180mm ≈ 12.2X [reference]. Effective pixel size is then 6.25um / 12.2 ≈ 0.51um
      • Temporal resolution for nuclear GCaMP is ~5sec [reference]
  • Present stimuli
    • Add projector to the microscope
    • Program projector to present images to the fish
    • Set up parameters: intensity, time delay, randomization seed etc
    • Present visual stimuli A, B, C
    • Control for intensity
    • Control for spatial location of stimuli
  • Process data
  • Analyze data

Each pixel of presentation costs you $$$

Reprinted from dev.to

Which one of these images has a higher pixel-to-information ratio?

Today I’ve been going through slides with a fellow graduate student and came up with an analogy:

Each pixel of your slides costs you money

Now, consider how much you get back from each of the pixels you put out there. Was it really worth it? How much information did you provide, and how much have you paid for it?

In a sense, it is similar to the “data-ink ratio” idea introduced by Tufte. The difference is that the “money” approach brings actual numbers to the stage.

When you put stuff on a slide, or on a plot, do you really get the most bang for the buck?

Most of the time, especially for the first 4 drafts, the answer is “no”.

How to get a higher return on pixel?

  • Simplify: a slide has to carry a message; how much simpler can it get?
  • Split: there has to be a single message; can you say one thing at a time?
  • Squestion: does this have to be here?

Not only StackOverflow: Making use of Stack Exchange

We all know and love that great Q&A website, stackoverflow.com (or SO).

Well, we all should know and love it. But not many people I’ve met are aware of the great variety of communities built on the SO platform. Here are some of my favorites.

academia.stackexchange.com

Ask or search existing questions for anything about higher education:

  • how to deal with publications
  • finding and communicating with an advisor
  • working on group projects
  • how academic departments work
  • what is expected from students, postdocs, RAs, TAs etc

workplace.stackexchange.com

A more general site for discussion of anything related to the professional workplace. After all, we are working in academia and science too. This is perfect for students trying to transition into research, since it covers:

  • relationships in the team
  • how to ask for a raise or promotion
  • what is expected from professionals
  • “Is this behavior professional and how should I deal with it?”
  • workplace issues, conflicts, harassment
  • management “up” and “down”

interpersonal.stackexchange.com

A lesser-known but still useful site focused on interpersonal relationships. There is a trope that some academics are not so good at communicating with friends and family, and this site might help.

codereview.stackexchange.com

For when we are done hacking our code and would like to make sure that it is at least reasonably written. Software is often a part of a scientist’s job, so it should be done reasonably well.

Making best PPT: advice from @AcademicChatter

There are a lot of ways to make a presentation better. And we should, since science is at least 75% presenting and talking about our work. Here are some recent ideas from a discussion reported by @AcademicChatter:

And a sad investigation when slides go wrong

Extreme Sciencing manifesto

This page describes a set of principles we try to follow in our day-to-day science job. These ideas are stolen from many areas, mainly from software development practices.

We try to follow these methods ourselves and implement them in our teams.

The latest version can be found and forked on GitHub: https://github.com/aandreev0/extremesciencing

Key elements

  • Agile: the main goal is successful research, not following some sort of “best practices” list
  • Rapid: we aim to move swiftly through the project, figuring out bottlenecks as soon as possible
  • Responsible: we acknowledge that mistakes will be made and proactively look for ways to lower risks

Notes on research

The result of sciencing is clearly presented, high-quality scientific results. They can be delivered as statistically significant observations of nature, or as novel, useful experimental protocols or data analysis tools that help discovery.

Notes on engineering

The central idea for engineers is that they design and deploy products, something tangible that will be used by other people. This could include microscopes, behavioral experimental setups, analysis and control software, data management solutions, and other elements of the research environment.

This means that projects should be treated in such a way that:

  • Development should be based on a written specification, the result of constant conversations between users and engineers
  • The final product includes evolving documentation
  • The final product includes a documented way to track issues and users’ requests for new features, as the product will contain bugs, but also evolve

Notes on people

  • We acknowledge that being people is hard and painful.
  • We acknowledge that talking to people is hard and painful, but it is the only thing that ever moved complex projects forward sustainably.

References and reading

Contributing authors: