Programming challenge: boolean parenthesizations

I thought I wasn't going to be able to post a challenge this week, because I'm traveling out of the country (with no Internet connection most likely.) However, last week I implemented the possibility to publish future posts in the blog, and this is how this one was created.

(I wrote the post on Friday, left the country on Saturday, and hoped my code worked to publish this post on Sunday. I'm back this coming Saturday hoping to see that everything was fine.)

For this week here is a pretty cool problem:

Given a boolean expression consisting of a string of the symbols 'true', 'false' and the operators 'and', 'or', and 'xor', write a program that determines all the ways you can add parenthesizes to the expression so it always evaluates to true.

For example, given the expression "true and false xor true" there are two possible solutions:

  • (true and false) xor true = true
  • true and (false xor true) = true

I've personally haven't thought about a solution for the problem yet, but I can easily see how a recursive brute-force algorithm should be pretty simple to create. If you want to make it more exciting, try to think in a non-recursive way of solving the problem.

(This problem looks a whole lot like a dynamic programming exercise.)

In case you are interested, here is the challenge from last week: Programming challenge: towers of Hanoi.

Adam Harris about the RSS feed

Adam Harris asks regarding my previous post about moving the RSS feed generation to a static process:

Why do you generate your rss feed each hour instead of just whenever you publish a new post?

This is a very good question. It was a long time ago when I wrote about the way my blog works behind the scenes, so to anyone reading the last post it might have sounded stupid what I did.

A new post in my blog is just a new file uploaded in the right folder with a specific name convention. Whenever I post something new, I clear the cache, and the engine reloads all the files and the new post shows up.

This means that I don't have a specific signal for when a new post is created (other than when there's no cache and I load all the existing files.) This makes it hard for me to determine when exactly the new RSS feed needs to be refreshed.

Right now it's every hour. I might be able to think in a better way to do this process without having to regenerate the feed every time, but I'm not sure the performance gain will justify the time to implement this.

I'll look into it. It's already in my TODO list for the blog.

Thanks Adam for your question!

Moving from dynamicly generated to a static RSS feed

Before I didn't get too much traffic to the blog. At least not like now.

Things are always easier with few users hitting your code (remember, I wrote the blog engine displaying the text you are reading right now.) One of my goals have always been to not pay for Google App Engine, and host this site as cheapest as I can.

And with more subscribers to my RSS feed, that goal was getting harder and harder to achieve.

I don't know a whole lot about RSS feed readers, but apparently they constantly ping your feed to detect new posts ("constantly" meaning from 10 to 60 minutes depending on the software.) The RSS specification includes a ttl attribute to specify for how long readers can cache the feed, but the readers accessing my site ignored it (or so it seemed.)

So I started to see that most of my traffic was to the /rss path coming from a bunch of feed readers (after Google Reader died, nobody has settle with a "standard" reader.) My feed was dynamically created (and cached) with each request, but this wasn't working in my favor.

I decided then to move the feed to Google Cloud Storage as a static file. Every hour I generate a new feed and save it in GCS as an XML file. All readers are now redirected to that file, so they don't have to hit my App Engine instances.

I removed so much (costly) traffic that it almost feels impossible.

The downside of course is that feed readers can be late up to an hour to receive the latest post from my blog, but I really don't care.

It's been working like this for 3 months now and I haven't heard any complaints.

The lesson here is that things will always change. No matter how "right" you think you are today, and how cool your solution is; scaling your application will bring a whole lot of new problems that will render all your textbook-solutions completely invalid.

Just be prepared for that.

Programming challenge: towers of Hanoi

I'm sorry I missed last weekend's programming challenge. I was on vacation and away from a computer (which was great!) I'll also be absent next weekend since I'll be traveling to Cuba, so I apologize in advance.

But here is the challenge for this week; a classic mathematical puzzle that I find super interesting to solve with a computer:

Given a stack of N disks arranged from largest on the bottom to smallest on top placed on a rod, together with two empty rods, write a program that finds the minimum number of moves required to move the stack from the first rod to the last one, where moves are allowed only if a smaller disk is placed on top of a larger disk.

For example, for 3 disks, the solution requires moving the disks 7 times:

  • Move disk from [1] to [3]
  • Move disk from [1] to [2]
  • Move disk from [3] to [2]
  • Move disk from [1] to [3]
  • Move disk from [2] to [1]
  • Move disk from [2] to [3]
  • Move disk from [1] to [3]

Ideally, your program will print the number of necessary moves together with each one of them.

I'm excited to see what you can come up with! I'm also thinking about not publishing a post with a solution anymore. Turns out that you guys are sending me solutions every week that are better or equally good as mine, so I'll probably tweet one of your solutions instead of writing one. (The exception will be on those problems where I actually want to solve them for fun or to learn something new.)

Update: Here is my solution to the problem, and here is another pretty cool solution that Jose Fidalgo tweeted me.

In case you are interested, here is the challenge from 2 weeks ago: Programming challenge: the position of the element.

Red warning messages are there for a reason

On May 26th Google completely removed OAuth1 support (they deprecated it back in 2012, but everything still worked until last week.) At my company, we have a couple of applications that relied on OAuth1 to access Google Spreadsheets.

Of course, they stopped working. Kinda big deal for us, since we are talking about several reports that started displaying outdated information.

I can't blame Google (they gave 3 years notice before removing support for OAuth1.) I developed one of the applications that went down, so it's my fault for not paying attention to the red warnings all over Google's documentation announcing the final date.

I fixed the problem already. Spent a lot of time, had to change more code than expected, and learned a couple of neat Python tricks during the process (trivial stuff, but I didn't know.) I feel better now about the new way of connecting to the spreadsheets and learned a very important lesson:

Pay attention to the fucking red warning messages!

Won't forget for sure.

20 common mistakes when doing Test-Driven Development

These are based on my own experience as a developer and what I've heard and seen from other people working with me.

(All the notes are from my talk TDD - Making sure everything works, at the Agile Transformation Summit 2015. You can download the full deck in PDF here).

  1. Thinking that your code is flawless: Quit thinking that you are above the rest because chances are that you are not: your code is as bad as mine. We all need help, and TDD is a great way to get it.

  2. Thinking that is all about the tests: It's also about design. It's about confidence. It's also about documentation. In a way, tests are a very nice byproduct of the process.

  3. Asking for permission: Why in the world would you consider asking for permission to do your job well in the first place? And if you are asking, it means that there's a different viable alternative, right?

  4. Not having your entire team onboard: Your team should be able to speak your language and give you support whenever needed. During TDD, you want to keep everyone engaged and contributing.

  5. Pursuing a specific code coverage: Code coverage is just about the quantity, not the quality. It doesn't tell how good your tests are. Are they testing the right thing? Are they easy to read and maintain?

  6. Failing to be consistent: An abandoned test suite is worse than not having anything at all. Is not-useful code that's sitting there needing maintenance and accumulating dust every single day.

  7. Not running your tests frequently enough: If you are not, something it’s really wrong with your TDD process. Your test suite is your thermometer, telling you how you are doing every step of the way.

  8. Failing to define proper conventions: You need solid name, structural, and architectural conventions to keep the entire team speaking the same language throughout the project.

  9. Testing other people's code: By definition, you can't do this. Splitting this process is like having one person drinking a beer, and another person puking when the former gets drunk.

  10. Writing code without having a failing test: This is probably the main problem you'll face. It's a habit that will cost you a long time to break, but eventually the right way will become second nature.

  11. Testing code that’s already written: Sometimes the code is already there, and if so, why do you need to write tests for it? I'd rather not write tests unless I have to modify the original code.

  12. Writing a test before all other tests are passing: Go one step at the time. You can't concentrate in multiple failing tests at the same time. Being methodical will pay off big.

  13. Refactoring before all tests are passing: Don't change code without having working tests. Tests are the only way you have to know whether your changes were successful.

  14. Writing code unrelated to the failing test: If the code is not covered by the failing test, nothing can tell you whether the code works as intended. Unrelated code is untested code.

  15. Testing more than one thing at the time: A test testing more than one thing is a complicated test. Keep tests as simple as possible. If you have too many assertions, you'll have to debug when the test fails.

  16. Writing slow tests: If your tests are slow, you aren't going to run them frequently enough. I'd rather have 99 fast tests that I can run every minute than 99 + 1 slow test that never get run.

  17. Writing overlapping tests: Avoid multiple tests testing the same thing. The clue is getting multiple tests failing with every error. This makes the feedback from your test suite hard to interpret.

  18. Testing trivial code: Only test what could possible fail and forget about everything else. By testing trivial code, you get distracted from testing the code that really matters.

  19. Creating dependencies between tests: It makes it impossible to run tests individually or in a different order. A problem in one test will cause other tests to fail.

  20. Writing tests with external dependencies: This usually produces slow tests. If dependencies break or cease to exist, your tests will stop working. This makes your test suite very fragile and unstable.

I'm sure you have seen more. I'd love to hear about it.

My first public speaking engagement was great!

Yesterday I was honored to give a presentation about Test-Driven Development at the Agile Transformation Summit 2015 (ATS 2015) that took place at Nova Southeastern University in Fort Lauderdale, FL.

This was my first public speaking engagement. I'm very happy (and surprised) with how well it came out, and I hope all the people that was there had as much fun as I did.

(The room was packed beyond what I expected. I'm so glad to see how many people are interested on TDD.)

The major highlight was having James Grenning listening to my talk and nodding throughout it. (James is one of the original signers of the Agile Manifesto, and considered by many one of the fathers of TDD.) When the talk was over, he came over and not only gave me a lot of great feedback, but mentioned how much he enjoyed my presentation, which was extremely gratifying. (And then tweeted about it, which was also great.)

Here is the link to download my presentation.

One of my pet peeves is to look at presentations that don't say much without the speaker, so I took special care to make this one "readable" even if you didn't listen to my talk.

If you want a quick summary of the presentation without downloading the PDF, I also wrote a post about it.

Solution to the position of the element

Here is my solution to last Sunday's programming challenge about finding the position of an element in an array (You can read the challenge here.)

The handwritten notes I took when thinking about this problem:

As you probably guessed, this problem is a whole lot like a regular search algorithm. A simple loop to find the element and return the position of it solves it (or return the position of the first element greater than our target.) I decided to also check for the first and last element of the array to cover those cases separately.

I called this simple loop solution "sequential". You can see the Java code in this Gist (method sequential).

The only problem with this sequential solution is that it doesn't scale well for large arrays: We have to visit every single element in the array, so this solution has an O(n) complexity.

When I published the problem, I asked to think about this case:

In order to make the problem a little bit more complicated, I'd ask you to think about an algorithm that can scale well to sufficiently big arrays.

Since the array is sorted, we can look into a binary search algorithm to find our solution. (Remember that binary search has an O(log n) complexity, which outperforms O(n).)

The method "binary" in this Gist contains this solution. Notice that this is exactly a regular binary search with the only difference that I'm returning the left bound of the array in case the target is not found (instead of returning -1.)

I tried hard to provide the best solution I could in the time I had, but I might very well be wrong (or you might have a better way of solving the problem.) If that's the case, please let me know and I'll update the post.

In case you are interested, here is the challenge from last week: Programming challenge: merging overlapping intervals.

Have you reached all your potential as a programmer?

This is not about what you know, but about how fast you can apply what you already know.

Think for a second how do you always try to solve a particular problem. To oversimplify this, I'm going to divide it in two different groups:

  1. The first group are those people who start from the beginning, and step by step work on their problem until they reach a solution. When they do, they are done. (And everything works, and it also looks good.)

  2. The second group are those people who care the most to reach the end as soon as possible (thus not paying attention to the details along the way.) After they find the right path, they come back and clean everything that has to be cleaned. (And then everything works as well, and it also looks good.)

(Just to be clear, I'm assuming both groups reach exactly the same place at the end. Not only their solutions are the same, but there's also no difference in the quality of their program.)

Granted, I think most people are in (have tried) both categories. I'm also not sure whether one of these is better than the other every single time.

But I try myself to be in the second group every time I can (thus meaning that I like better the second approach.) Here is what this means:

  • When I'm trying to solve a problem, I don't care about anything else than to find the solution (the feedback I need to know that I'm done.)

  • From my coding, I remove every detail that gets in the way of speed. Remember, my goal is to move fast to get feedback as soon as possible.

  • I heavily use mocks, hardcoded values, ugly code, and anything that I need to move on and get to the end.

  • I usually have to ask anyone pair programming with me to be patient: people freak out when they see the tail of debt I leave when I'm moving fast.

  • When I find the light at the end of the tunnel, I come back cleaning: mocks, hardcoded values, ugly code, and every compromise I made gets fixed.

(Some people would tell you that this approach is called "prototyping". It might be, but I don't like to call it that way: the word "prototype" forms a picture of an incomplete product in my mind.)

Why do I like this approach anyway?

Because I've seen how fast it is.

I've worked with people who know it all, and their only difference has been how fast they can get from point A to point B: the time it takes to get things right when you aren't sure yet those are the things you need, adds up against you really quick. Being messy while you find answers, and then taking the time to clean up your mess (the mess that really matters) comes up ahead most of the time.

Is this for you?

Seems so obvious and simple, but I promise it's very powerful!

I've taught this approach to several colleagues before. They've tried and now they can't go back. They feel they work faster now. I definitively know they do.

The key is to constantly evaluate whether you are ready to clean things up or keep moving along. If you've proven your point, start cleaning. If you haven't, don't worry about the details yet.

If you find yourself in the first group most of the time, try this for a change. You might not like it, and that's fine, but who knows? There might be something there for you.

Programming challenge: the position of the element

Let's try something simple this week (because it's Memorial Day week. I promise to rise the bar next week):

Given a sorted array and a target value, return the index if the target is found. If not, return the index where it would be if it were inserted in order. There won't be duplicate values in the array.

For example:

  • [1, 3, 5, 6] with target value 5 should return 2.
  • [1, 3, 5, 6] with target value 2 should return 1.
  • [1, 3, 5, 6] with target value 7 should return 4.
  • [1, 3, 5, 6] with target value 0 should return 0.

In order to make the problem a little bit more complicated, I'd ask you to think about an algorithm that can scale well to sufficiently big arrays.

As always, feel free to tweet me your solutions and ideas. I will publish mine next Wednesday (and will update this post with a link to it.) Here is my solution to this problem.

In case you are interested, here is the challenge from last week: Programming challenge: merging overlapping intervals, and here is the next challenge in the list: Programming challenge: towers of Hanoi.