Wednesday, July 31, 2019

Professional Determination or How to Solve Problems

Since I was little, my mother always told me "you should learn, learn and learn" and "knowledge is power".
Until not so long ago, I truly believed that knowing a specific technology, or every detail of a system, was the core of professionalism. Over the years I discovered that there are skills which are more important than pure knowledge, most notably the skill of solving problems.
This skill brings one to the next level and differentiates senior developers from junior ones.

This skill is so powerful because many of its methods do not require any particular proficiency.
Mastering problem solving does require experience, but the good news is that you will encounter plenty of broken things to practice on along the way.
Such problems can appear during development and, unfortunately, even in production.

So what is the problem? Sometimes we don't strive for a solution:
  • We give up too quickly
  • We get stuck and try the same thing over and over because it’s supposed to work (to quote a saying often attributed to Einstein: "Insanity is doing the same thing over and over again and expecting different results")
  • We tend to blame others or pass the responsibility to them, preferably to those who can't defend themselves (like a CI agent or an off-the-shelf product) or to the usual suspects (like other teams).
  • We believe that someone else will help us solve it.
  • We look under the streetlight
  • We find a reasonable enough explanation and do not bother to prove it.
    • Voodoo is not a good enough explanation...
    • A bug in an external product? I accept that explanation only when we find a matching open issue in the product's docs.
  • We don't put enough effort into reproducing the issue
  • We don't know how to approach it…

So I propose the following problem solving methodology:
  1. Define the problem
    The most important task when approaching a problem is to define what the problem is. It's common knowledge that "a problem well defined is half solved", and moreover, different problems have different solutions.
    This seems trivial, but it is tricky in practice.
    For instance, I get up in the morning, I need to get to work, and my car won't start. What's my problem? Surely, the car that won't start. No! My problem is that I need to get to work. If I try to solve the first, I can get the car to the garage, I can call someone to help me with the car, I can read the manual and try to figure it out myself. But, if my problem is getting to work, then I can take a bus, or ride with a colleague.
    Focus on what you need (getting to work), and not what you want (the car to start).
    (Though don't forget to fix your car later on, or you might add other problems… 😊)

  2. Gather information
    Gather as much information as possible. First collect the information and then determine what you can do with it. It's important to do this as soon as the problem is discovered as some of the data may not be available anymore at a later stage.
    Such information can be:
    • Concrete examples that don't work
    • Backups of logs, stack traces and core dumps
    • Settings: environment variables, command line invocation, configuration files
    • When did it last work? (If it ever did)
    • What is the system's flow? The problem probably hides somewhere in it
    • What does it look like when it's working? Compare it to the not-working state
    • What changed since it last worked?
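Since some of this data disappears quickly, it can help to capture it automatically the moment a problem is noticed. The following is only a minimal sketch of that idea in Python. The function name `collect_snapshot`, the `APP_` prefix and the choice of fields are assumptions made for illustration, not part of any particular tool.

```python
import json
import os
import platform
import sys
from datetime import datetime, timezone


def collect_snapshot(config_paths=(), env_prefixes=("APP_",)):
    """Capture volatile context at the moment a problem is discovered.

    Only environment variables starting with one of env_prefixes are kept,
    so unrelated (and possibly secret) variables are not written out.
    """
    snapshot = {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "command_line": list(sys.argv),
        "working_directory": os.getcwd(),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "environment": {
            name: value
            for name, value in os.environ.items()
            if name.startswith(tuple(env_prefixes))
        },
        "config_files": {},
    }
    for path in config_paths:
        try:
            with open(path, encoding="utf-8", errors="replace") as handle:
                snapshot["config_files"][path] = handle.read()
        except OSError as error:
            # A missing or unreadable file is itself a piece of information.
            snapshot["config_files"][path] = f"<unreadable: {error}>"
    return snapshot


if __name__ == "__main__":
    # Paths passed on the command line are treated as configuration files.
    print(json.dumps(collect_snapshot(config_paths=sys.argv[1:]), indent=2))
```

Saving the output next to the backed-up logs gives a "not-working" snapshot that can later be compared with the same snapshot taken from a working environment.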

  3. Find the root cause

    • Reproduce and debug
      Reproducing the error enables us to gather more information, either by debugging or by adding logging. This helps us understand why something happened the way it did and what went wrong.
      If we are lucky, the failure is consistent. It's much harder to debug something that happens only some of the time.
    • Elimination
      Another method to pinpoint the problem is to compare a working case with a non-working case. If a functionality used to work at a certain point in time and now it doesn't, then there is a point where it changed from a "working" to a "not-working" state, and we only need to locate that point. This can be done theoretically or practically, by removing parts of the code, by copying parts of the code to a new environment, or by bisecting commits. (Is it only me, or does this remind you a bit of the intermediate value theorem? 😇) Note that this is basically a brute-force method and it does not require any deep knowledge of how the code works. That makes it a very powerful methodology: it enables you to solve problems with minimal knowledge, just trial and error.
    • Play the detective
      This method is useful when solving problems that are hard to reproduce in development and all you have is the information you gathered.

      Make hypotheses and prove or disprove them using the symptoms, just like Sherlock Holmes. Note that even a lack of symptoms is a symptom: in the Sherlock Holmes story "Silver Blaze", a dog not barking revealed that the criminal was someone familiar to the dog. Similarly, in our case, a missing log line can tell us something even when the log initially looks free of errors.

      In addition, be suspicious. If something doesn't seem right, even if it's not a clear error, it might suggest a problem worth investigating.
Finding the root cause can be time-consuming, so it's important to recognize when you've reached a dead end.
If you feel like you're stuck, try something different: switch methods, or stop and take a step back to consider whether the problem has been defined properly.
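The elimination method, when applied to commits, is a binary search. Here is a minimal sketch in Python, assuming a hypothetical `is_good` check (for example, "build this commit and run the failing scenario") and assuming that once the problem appears it stays in every later commit.

```python
def find_first_bad(commits, is_good):
    """Return the index of the first commit for which is_good() is False.

    Assumes len(commits) >= 2, commits[0] is known to be good, commits[-1]
    is known to be bad, and every commit after the first bad one is bad too.
    """
    low, high = 0, len(commits) - 1  # low: known good, high: known bad
    while high - low > 1:
        middle = (low + high) // 2
        if is_good(commits[middle]):
            low = middle   # the change happened after `middle`
        else:
            high = middle  # the change happened at or before `middle`
    return high


# Illustrative usage with made-up commit names: "c6" introduced the problem.
commits = [f"c{number}" for number in range(10)]
first_bad = find_first_bad(commits, lambda commit: int(commit[1:]) < 6)
print(commits[first_bad])  # c6
```

Each check halves the remaining range, so even a thousand commits need only about ten checks, and none of them requires understanding the code. In practice, `git bisect` automates the same search over a real history.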


  4. Find the solution
    • Google
      Simplify and generalize the problem into a question for Google. Make the question as general as you can and strip out as many of the specifics as possible.
    • Handle the cause
    • Focus on what you need, not what you want
      For example, you are conducting integration tests with another team and their environment becomes unavailable. You might want them to fix that environment, but actually what you need is an environment, and not that one specifically.
    • Handle the symptoms
      It's important to note that sometimes handling the cause is not the best course of action in the short term. That may be the case when the problem is critical and you can't develop or deploy the solution quickly enough, or when you haven't yet found the actual cause. Meanwhile, there's a problem waiting to be solved, no matter what the cause is.
      In that case you need to find a solution that is good enough for your problem, even though it might not be what you intended at the beginning. Beware of the difference between a working solution and a "seems to work" solution. Also, don't forget to revisit the problem later on and handle the cause.
    • Communicate
      Maybe you are not the only one in your group experiencing these specific issues, and others can help.

I believe everything is solvable and the world does eventually make sense, which implies there is a logical explanation behind anything that appears not to work properly, no matter how unreasonable things might seem to start with. Once we realize that it is up to us to solve it, and not someone or something else, there is nothing that can stop us from finding a solution (except maybe unknown bugs in external products...).
It doesn't really matter whose fault it is (counterintuitive, perhaps, to human nature: we want someone else to be blamed instead of working together for the cause). The only thing that matters is that you have a problem and you want it solved, so it is in your own interest to get it solved.


This is what I call professional determination.

Have you encountered any interesting problems recently? How did you solve them?

Wednesday, July 10, 2019

The Fundamental Step Towards a Self Managed Team


I recently heard a saying that "a perfect manager is the one that you don't need".
At first, such a saying might be interpreted as offensive, implying that this manager is useless, which is the reason he (or she) is not needed. But let's think about it: a manager who can orchestrate everything so that the team can manage without him is a damn good manager.

A self-managed team is a manager's dream. Obviously, it does not mean that there is no need for a team leader, but a team that achieves self-management gives the manager the much-needed time to focus on the important things, and guarantees that tasks still get done appropriately while he is absent.

When considering self-managed software teams we can discuss issues such as the team's collaboration, commitment, ownership and so on.
However, a more fundamental step towards a self-managed team is to reduce the team's daily dependency on the team leader.

In this case, we need to ask ourselves, when does the software team need its team leader?

1.    Task definition
The team leader sees the broader picture and is aware of future directions because he takes part in all relevant meetings. However, although he knows what needs to be done, this isn't always properly reflected back to the team, resulting in the team missing the vision. For a team leader, when it comes to defining tasks, it's easier to write a one-line task knowing what stands behind it. The problem is that the team cannot work with it without further explanation that only the team leader can give.
 
2.    Task allocation
Without proper task allocation team members might find themselves working on less important tasks, or having nothing to do at all. Not to mention the possibility that multiple team members may work on the exact same task.
If the team leader is the only one to know what the required tasks are and what work is already in progress, then the team leader is the only one who can allocate new tasks to the team members. The allocation is also affected by the team members' proficiency and the team leader's management methodology.
 
3.    Making decisions
When the tasks are being detailed, it's easy to forget why they originated, that is, what purpose they serve in the bigger picture. This makes the team leader the only one who can make conscious decisions concerning them.
In addition, he has the experience and the authority to make professional decisions, and the team members might find themselves waiting for the team leader to decide.
 
4.    Reviewing the tasks
In order to perform a meaningful review, the reviewer needs professional knowledge and must review the task in light of its requirements. Because of that, a team leader might find it hard to let others approve tasks prior to delivery, whether because he is the only one who knows the purpose of each task or because he is the professional authority.
In this case he might quickly become a bottleneck for task delivery, if he is the only one that can review and approve the tasks.
 

What can be done about that?


Have a well-defined backlog. A backlog that contains the tasks, together with a well-defined methodology for working with it, can answer the team's daily needs.
It's true that having such a backlog requires a lot of ongoing work, even when there is apparently no time for it. However, this is the way to reduce the dependency on the team leader, and it is the basis for the team to work as a team.

A well-defined backlog comes down to three main things:

1.    Task definition
The focus here is on how to define the tasks after they've already been determined (how to break down tasks is a separate doctrine that could fill a whole post on its own).
When the tasks are properly defined, the team leader isn't the only one who can understand them. It helps the team make conscious decisions because they are aware of the original needs. It enables the team to become more independent, think of solutions and judge whether they meet the requirements during development and review.
·       Tasks should reflect the needs: not only what we want, but what we need. Tell the story behind the story.
·       Tasks should be self-contained: everything the team needs to know about the task should be stated, in a way that even a developer new to that feature will understand what needs to be done. Tasks should contain links to other sources, if required.
·       The definition of done is clear: the criteria for determining that the task is complete. The theory behind this can be the subject of a post of its own.
·       It can be good to include some team members when defining the tasks, to make sure they understand them and that nothing is missing.

2.    Backlog grooming
This means ongoing work to make sure the backlog is ready to work with.
In this way when a developer needs a new task he can take the task with the highest priority that is unblocked.
·       All tasks should appear in the backlog
·       Tasks need to be prioritized
·       Dependencies should be reflected in the backlog
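To make "take the task with the highest priority that is unblocked" concrete, here is a minimal sketch in Python. The `Task` fields, the convention that a lower number means a higher priority, and the sample task names are all assumptions made for illustration. A real team would rely on whatever its tracking tool provides.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    priority: int  # assumption: lower number = more important
    status: str = "TODO"
    depends_on: list = field(default_factory=list)  # names of prerequisite tasks


def next_task(backlog):
    """Return the highest-priority TODO task whose dependencies are all DONE."""
    done = {task.name for task in backlog if task.status == "DONE"}
    candidates = [
        task
        for task in backlog
        if task.status == "TODO" and all(dep in done for dep in task.depends_on)
    ]
    # None means that nothing is currently available to pick up.
    return min(candidates, key=lambda task: task.priority, default=None)


backlog = [
    Task("design API", priority=1, status="DONE"),
    Task("implement API", priority=2, depends_on=["design API"]),
    Task("write docs", priority=1, depends_on=["implement API"]),
    Task("load test", priority=3),
]
print(next_task(backlog).name)  # implement API
```

"write docs" has the best priority but is still blocked, so the developer picks up "implement API" without having to ask anyone. This only works if the three conditions above hold: the task is in the backlog, it has a priority, and its dependencies are recorded.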

3.    Backlog methodology
·       Have a board that reflects what the team is doing
Tasks can either be in TODO, BLOCKED, DEVELOPMENT, REVIEW or DONE.
In this way, team members can take tasks without worrying whether someone else is already working on them. Furthermore, there is always an up-to-date status for anyone who needs it.
·       The team makes decisions together
·       Team members perform code reviews.
A code review is just like any other task. When a developer needs a new task, he can review another task if it's prioritized. The priority of the review is derived from the priority of the task.
We should aspire for everyone to be able to perform code reviews, and it's our task to train junior developers to be proficient at it. This in turn reduces bottlenecks and increases the team's knowledge of the system. By emphasizing reviewing the tests, even a developer who isn't familiar with the implementation of a certain part of the system can still review changes to it.
·       The team is versatile: it's not a must, but it does contribute significantly to the team's ability to work as a team rather than as a group of individuals.

It's certainly not easy to implement all of these things, but it is worthwhile, and it facilitates the team's progress towards self-management.

How is it on your team?



Wednesday, April 19, 2017

TOP 3 Effective Code Reviewing Checklist

A few years back I composed an extensive checklist of things to consider while reviewing code. I started compiling this list after I saw that many code reviews were not as effective as they could be.
Each of the reviewers focused on things they believed were important (which is, of course, individual) or mainly commented on cosmetics (the easiest things to spot).
Lately, I was asked by the unit director to give my suggestion for a "TOP 5 Effective code reviewing checklist". The first thing that came to my mind was "how is that even possible?" Reflecting on my past checklist, I was certain that there are so many things worth considering. However, "if there are more than five items in the list", they said, "no one will follow it". And they had a point. No one was using my checklist. Not even me…
After contemplating, I realized that I can summarize almost everything required for an effective code review in the following TOP 3 questions:

1.      IS THE CODE WELL TESTED? 

I'm not a fan of Test Driven Development (TDD) methodology, at least not in its extreme (Writing only the code that passes a single test). Instead, I suggest something else: Why not start the review by reviewing the tests first? You can call it: Test Driven Code Review.

This question forces the reviewer not only to understand what the reviewed feature is supposed to do, but also to perform the review in light of the requirements. This is especially important in teams in which not every programmer knows every feature in the software. Sadly, it is easy to review lines of code without understanding the bigger picture, trusting that the programmer has addressed the requirements. Moreover, just as it is easy to write tests that verify the written code (and not the requirements), it is easy to review tests in light of the existing code. Reviewing the tests before reviewing the code avoids this, and can help prevent fixation and bias on the part of the reviewer.

Secondly, I cannot stress enough that tests are, in my opinion, the most important part of software development (I will save that for another post…). As such, they deserve to be checked first, while the reviewer is fresh and sharp, rather than last, when the reviewer might feel tired and that most of the review is already behind him or her. I truly believe that having the programmer know that the tests are the main focus of the review will also have a strong effect on the programmer.

Lastly, a common saying is that "it is hard to maintain the tests". It can be true, but we should not cave in to this… We need to demand the same high quality from our tests as we demand from the source code, mainly that they are easy to extend or modify. Therefore, it is important that the test code is treated the same as the source code when we come to review it. Thus, readability and usability are important for the test code as well.
This question (that is, is the code well tested?), along with an automated build and code coverage, guarantees that:
a.      The feature does what it's supposed to do.
b.     The feature doesn't do what it's not supposed to do.
c.      The code does not break existing features.
* Note that there are different levels of tests to consider (unit, component, integration, performance and load tests), and each level should receive proper attention.
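As a small illustration of points (a) and (b), here is a sketch in Python of tests written from a requirement rather than from the code. The requirement itself ("a username is 3 to 20 characters long and contains only ASCII letters and digits") and the function name are hypothetical, chosen only to keep the example short.

```python
def is_valid_username(name):
    """Hypothetical requirement: 3 to 20 characters, ASCII letters and digits only."""
    return 3 <= len(name) <= 20 and name.isascii() and name.isalnum()


def test_accepts_what_the_requirement_allows():
    # (a) The feature does what it's supposed to do, including at the limits.
    assert is_valid_username("abc")      # shortest allowed length
    assert is_valid_username("a" * 20)   # longest allowed length
    assert is_valid_username("user42")


def test_rejects_what_the_requirement_forbids():
    # (b) The feature doesn't do what it's not supposed to do.
    assert not is_valid_username("")
    assert not is_valid_username("ab")         # too short
    assert not is_valid_username("a" * 21)     # too long
    assert not is_valid_username("user name")  # contains a space
    assert not is_valid_username("user-42")    # contains punctuation
    assert not is_valid_username("usér")       # non-ASCII letter


if __name__ == "__main__":
    test_accepts_what_the_requirement_allows()
    test_rejects_what_the_requirement_forbids()
    print("all tests passed")
```

A reviewer who reads these tests first can compare them with the stated requirement before looking at the implementation. If a case is missing, such as the non-ASCII letter, that is noticeable without reading a single line of the feature's code.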

2.      IS THE PRODUCT EASY TO MAINTAIN? 

This question addresses issues of support and maintenance after the feature is deployed, along with the entire product. I think this point doesn't get the attention it deserves. I suggest the reviewers ask:

a.      Is the flow understandable from the logs? I recommend reviewing actual log files and not only the logging statements in the code.
b.     How well are errors handled?
c.      What alerts are possible?
d.     Is it easy to monitor and control the feature?
3.      IS THE CODE EASY TO MAINTAIN?
Since we already know that changes are inevitable, we need to verify that new code is easy to extend, change or fix, even by a programmer other than the one who wrote it. This requires us to check its readability and design by means of:
a.      Coding conventions.
b.     Design metrics, such as coupling and cohesion.
c.      Proper documentation.
d.    Readability and meaningful comments (that explain why rather than what).
These are my TOP 3 recommendations for an effective code review. But, since the world is not perfect, and since our tests will never cover everything, I suggest two additional (secret) bullets:

4.      DOES THE FEATURE WORK?
In my original checklist, this was my first and most important bullet. The reason why it is only fourth now (and secret) is that, intuitively, this bullet is supposed to be answered by the first bullet: if the feature is well tested and the tests cover the functional requirements, then the feature works. But in practice the tests will be incomplete, and another level of quality assurance is required. Reviewing the code itself, looking for bugs and making sure that the code covers all functional requirements, provides that additional assurance. Moreover, some bugs are really easy to spot in the code, yet very hard to test (for instance, concurrency problems).
5.      DOES THE FEATURE HAVE SIDE EFFECTS?
While conducting a code review it is tempting to review only the lines of code that changed, rather than all the code parts that relate to the changed code. This is mostly because we use comparison tools to perform the code review which results in ignoring the related code that did not change.
If the entire software is tested properly, then this question is supposed to be covered too. In addition, well-designed code will reduce the unwanted side effects of altering the code. Otherwise, we should check how the feature interacts with existing features, for instance:
a.      Do other features that use the altered code still work?
b.     Do assumptions change?
c.      Does the product's performance change?
To summarize my checklist for an effective code reviewing process:

1.      IS THE CODE WELL TESTED?
2.      IS THE PRODUCT EASY TO MAINTAIN?
3.      IS THE CODE EASY TO MAINTAIN?
4.      (DOES THE FEATURE WORK?)
5.      (DOES THE FEATURE HAVE SIDE EFFECTS?)
Have a happy and productive code review!
