| Vague ramblings on Unit Test frameworks, Mocking frameworks, and Inversion of Control containers | |
|---|---|
| Topic Started: Feb 26 2014, 01:00 PM (1,173 Views) | |
| jdege | Feb 26 2014, 01:00 PM Post #1 |
|
NSA worthy
|
So, you're working on a program. What do you do? Write all the code, for everything, from beginning to end, then hand it off to the compiler and see it work, first time? Of course not. As John Gall wrote, back in the Dark Ages (1975): "A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system."
So what do you do? You create a simple system that works, and then add to it, growing it to cover more and more of the requirements, in small steps, maintaining a working system all along the way. Traditionally there were two basic approaches - top-down and bottom-up.

In the top-down approach, you begin with your main routine and the higher-level functions. Your higher-level functions would depend on lower-level functions that didn't exist yet, so you'd write stub routines that took the right arguments and returned a reasonable value, but didn't actually do anything, so that you'd have a system that would compile and run. Then you'd expand those higher-level functions, implementing the code and creating lower-level stubs so that it would work. And after every iteration, you'd build and test, and make sure that the code you had was working the way you'd intended. You'd be testing, and writing code to support your tests, as you went along, but you'd be replacing the supporting code, and throwing it away, as you progressed. When you were done, you'd have a working program, and all of your test stubs would be gone.

In the bottom-up approach, you worked from the other direction. You'd identify pieces of low-level functionality that you'd need, and you'd implement them. Then you'd create scaffolding around them, so that you could call your low-level functions and ensure that they'd work. You'd continue by building higher-level functions, using the lower-level functions, building scaffolding for them, and tossing the scaffolding you'd used. You'd still be testing after every iteration, though instead of exercising an incomplete application from the top, you'd be running some number of scaffold applications. And in the end, you'd have a working application, and you'd not have working tests.

In practice, most applications were built using a combination of these, building core functionality bottom-up, and then fleshing out the UI from top-down.
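As an illustration of the top-down style, here is a minimal sketch in Python. The function names are invented for the example, and the stubs exist only so the program runs end-to-end at every step:

```python
# Top-down development sketch: the high-level routine is written first;
# the lower-level functions it needs are stubs that accept the right
# arguments and return plausible values, so the program runs at every step.

def load_key():
    # Stub: a fixed key until real key handling is written.
    return 13

def apply_cipher(text, key):
    # Stub: no real cipher yet - just echo the input.
    return text

def encrypt_message(text):
    # The high-level function, already in its final shape.
    return apply_cipher(text, load_key())

if __name__ == "__main__":
    # Build-and-test after every iteration: the program always runs.
    print(encrypt_message("attack at dawn"))
```

As each stub is replaced with a real implementation, the program keeps compiling and running, which is exactly the property the top-down approach is after.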
It didn't really matter; the result was an application that was almost always difficult to maintain. The testing stubs and scaffolding that had been so useful during development were gone.

In 1998, Kent Beck published a paper describing SUnit, a framework for managing automated unit tests in Smalltalk: Simple Smalltalk Testing: With Patterns. His suggestion? That as we built the individual components of an application, we'd simultaneously build a library of test functions that would call the methods of the individual components with known arguments, and would check whether the results were what was expected. That we'd have a framework under which all of these tests could be run, that would report which of them failed. And that we'd make running the entire set of tests a normal part of the build process, so we'd be constantly testing to ensure that what we were doing hadn't broken the existing code.

People jumped on the idea - we'd write test suites as we worked, running them repeatedly, after every change, so we'd be able to tell immediately whether our changes had broken existing functionality. SUnit became JUnit, in Java, and CUnit, in C, and it wasn't long before people were talking about xUnit, as automated testing frameworks began to spring up for every language imaginable: http://en.wikipedia.org/wiki/List_of_unit_testing_frameworks

And all of our problems would be solved. Except, of course, that they weren't. It turns out that an xUnit framework is only one of three technologies that are essential to writing code that is testable and maintainable. The other two are a mocking framework and an Inversion of Control container. These work together, synergistically, to create capabilities far beyond what any two of them can accomplish alone. The reason that unit testing doesn't work, by itself, is that modules depend upon each other, which makes it difficult to test them in isolation.
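The pattern Beck described is easy to see in Python's built-in unittest module, a direct descendant of the SUnit design. A minimal sketch - the rot13 function is a stand-in chosen to fit this forum, not anything from Beck's paper:

```python
# Minimal xUnit-style tests in Python's unittest (a descendant of SUnit).
# Test methods call the code under test with known arguments and assert
# on the results; the framework runs them all and reports any failures.
import codecs
import unittest

def rot13(text):
    # The code under test - a stand-in example for this thread.
    return codecs.encode(text, "rot_13")

class Rot13Tests(unittest.TestCase):
    def test_shifts_letters_by_13(self):
        self.assertEqual(rot13("attack"), "nggnpx")

    def test_applying_twice_restores_input(self):
        self.assertEqual(rot13(rot13("attack at dawn")), "attack at dawn")

if __name__ == "__main__":
    # Running the module runs every test and prints a pass/fail report.
    unittest.main(exit=False)
```

Wiring this into the build is then a one-line step, which is how the "constantly testing" part of Beck's suggestion becomes practical.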
In another thread, I recommended a video: Inversion of Control from First Principles - Top Gear Style. In it, Andrew Harcourt uses as an example a function that behaves differently on Tuesdays, and asks how you could test that it would work right without waiting until Tuesday. If you're running a desktop app, changing the system clock is a possibility, but if you're running on a multi-user machine, or on a web server, changing the system clock might not be feasible.

The example I usually use is a login page. It's simple enough functionality - accept username and password, encrypt the password, read the user's encrypted password from the database, and compare. But how do you test it? You need a database server, and a database, and a user table populated with the specific user records our tests expect. And suddenly setting up and running our tests becomes a major effort.

The problem is that most pieces of code depend on other pieces of code, which have to be present in order to run them. Because of this, unless you make a concerted effort to manage your dependencies, you end up with modules that cannot be tested independently, but can only be tested as an integrated unit. That can only be tested, for example, when the database is present, with a particular set of test data loaded. Which makes for tests that take too long, and especially that are too much work to set up, to be run with the frequency that we'd like.

To be continued....

Edited by jdege, Feb 26 2014, 01:36 PM.
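The login example can be sketched in a few lines; all names here are illustrative, and hashing stands in for "encrypt the password". The point is what the function silently depends on:

```python
# Sketch of the login check described above. As written, check_login is
# hard to unit-test: it talks to a real database file, so every test run
# needs a database with known user rows already loaded.
import hashlib
import sqlite3

def check_login(username, password, db_path="users.db"):
    digest = hashlib.sha256(password.encode()).hexdigest()
    conn = sqlite3.connect(db_path)
    try:
        # The hidden dependency: a users table must exist and be populated.
        row = conn.execute(
            "SELECT password_hash FROM users WHERE name = ?",
            (username,),
        ).fetchone()
    finally:
        conn.close()
    return row is not None and row[0] == digest
```

Even this toy version shows the setup burden: before a single assertion can run, a test has to create the database, the table, and the expected rows.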
|
| When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl. | |
|
| novice | Feb 26 2014, 03:03 PM Post #2 |
|
Super member
|
Makes very interesting reading. A few key things I picked out:

"I recommend that developers spend 25-50% of their time developing tests." Kent Beck

That is an eye-opener. I have never kept a log of how much time I spend on programming an application, and of that how much time is spent correcting faults. But now that I reflect on it, I probably spend at least 75% of my time sorting out mistakes and finding bugs. If that sounds very amateurish, that's probably because I am an amateur. However, I realise now that I should be developing tests!

"The reason that unit testing doesn't work, by itself, is that modules depend upon each other, which makes it difficult to test them in isolation." jdege

"The problem is that most pieces of code depend on other pieces of code, which have to be present in order to run them. Because of this, unless you make a concerted effort to manage your dependencies, you end up with modules that cannot be tested independently, but can only be tested as an integrated unit." jdege

Yes. It can be very frustrating to sort out problems in one module only to find that the problem is just passed on to another. And interdependencies are pretty much universal in the cipher solving programs I write. So your next instalment is eagerly awaited.
|
| mok-kong shen | Feb 26 2014, 04:46 PM Post #3 |
|
NSA worthy
|
http://en.wikipedia.org/wiki/Unit_testing contains IMHO a good overview of unit testing. |
|
| jdege | Feb 26 2014, 06:35 PM Post #4 |
|
NSA worthy
|
That it does, but as I said, unit test frameworks on their own don't address the larger problem. The issue isn't how to write tests for our code, but how to write code that can be easily tested. And that turns out to provide significant benefits beyond testability. |
|
| mok-kong shen | Feb 26 2014, 08:21 PM Post #5 |
|
NSA worthy
|
Very true. The issue of doing good software design has been approached, as far as I am aware, from many directions, including better PLs (OOP etc.), modelling languages (UML etc.), design patterns, development environments, program synthesis, formal verification, etc. etc. On the other hand, there seems to exist no unique "philosopher's stone" for software design (and the related testing). One would be very rich, I believe, if that could be found.

Edited by mok-kong shen, Feb 26 2014, 08:52 PM.
|
|
| jdege | Feb 27 2014, 05:03 AM Post #6 |
|
NSA worthy
|
So, we were discussing dependencies....

At the lowest level, you write procedures, subroutines, or functions, and these call each other, creating dependencies between them. When you group functions together into classes and objects, there are dependencies between them. Bundle your classes into packages or libraries, and there are dependencies between them. The first rule of software design is that, if you chart out the dependencies, they must form a Directed Acyclic Graph. If you allow a cycle where A depends on B, and B depends on C, and C depends on A, you've essentially eliminated A, B, and C as independent entities. They can't be built, used, run, or tested, except as a group.

And that was state-of-the-art, back in the seventies - high-level modules would depend on low-level modules, and all would be good. Except it wasn't. True, if you violated this, you'd get into trouble real fast, but if you followed it, things weren't much better. Your lowest-level modules were reusable and testable, but they made up a small portion of your code. The higher levels were less and less reusable, and harder and harder to test, as they depended on more and more.

Then came the Object-Oriented Revolution, and with it the Dependency Inversion Principle: high-level modules should not depend upon low-level modules; both should depend on abstractions.

Suppose, for example, you're writing a simple command-line ROT13 cipher program. It reads from the console, shifts letters by 13, and writes to the console. Such a design would work, but would be inflexible. Your encryption function would depend on the low-level console read and write functions. A more flexible design would be, instead of using the low-level console read and write functions directly, to define a reader and a writer abstraction. (In some languages this would be called an interface, in others an abstract class; the details aren't important to the concept.)
You'd then create classes that implemented your reader and writer abstractions, using the low-level console read and write functions. Instead of having your higher-level module - the encryption function - depend on the low-level modules, you'd have it depend on the abstractions, so it would work with any reader and writer. Your readers and writers would also depend on the abstractions.

This has two advantages. The first is that your encryption function becomes more flexible. You can use it to encrypt a file, or a string extracted from a web page, or whatever, and write to a file, or a web page, or send an email or a text message, so long as you can build a reader and/or writer. The second is that your encryption function becomes testable. When it was reading from the console, the only way to provide it with data was to type it at the keyboard. With it now relying on an abstract reader, you can create a reader that provides it with exactly what you want it to have in your test.

But this comes at a cost. The original design only worked with the console, but you didn't need to give it anything in order for it to work. The more flexible design will work with any reader and writer, but you need to create your readers and writers, and you need to tell your encryption function which reader and which writer it is supposed to be using. Both of these are real problems. Which is what led us to Dependency Injection.

To be continued....

Edited by jdege, Feb 27 2014, 05:58 AM.
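In Python, where duck typing plays the role of an interface, the reader/writer design described above might be sketched like this (class and function names are my own, not from any particular implementation):

```python
# The encryption function depends only on the reader/writer abstractions:
# anything with read()/write() will do - console, file, or, in a test,
# simple in-memory objects like these.
import codecs

class StringReader:
    def __init__(self, text):
        self.text = text
    def read(self):
        return self.text

class StringWriter:
    def __init__(self):
        self.result = None
    def write(self, text):
        self.result = text

def encrypt(reader, writer):
    # High-level module: no mention of the console anywhere.
    writer.write(codecs.encode(reader.read(), "rot_13"))

# In a test, the in-memory reader and writer stand in for the console:
writer = StringWriter()
encrypt(StringReader("attack at dawn"), writer)
print(writer.result)   # nggnpx ng qnja
```

A console version would just be another pair of classes whose read() calls input() and whose write() calls print() - the encrypt function never changes.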
|
|
| mok-kong shen | Feb 27 2014, 05:13 PM Post #7 |
|
NSA worthy
|
The No Free Lunch Principle applies everywhere in life, unfortunately. What I would like to mention is that the design of software highly depends on the project (nature of task, scope, cost, impact in case of deficiency, etc.). Take an analogy: the design of a bicycle and that of a Mars vehicle evidently need different amounts of carefulness. One shouldn't ignore the economic aspects IMHO. Huge software projects and little application programs would naturally require different genres of design disciplines and design support software, and methods of test and audit, as well as project and personnel management, etc. etc.

BTW, a link that could be relevant for the present thread: http://en.wikipedia.org/wiki/Cleanroom_software_engineering

Edited by mok-kong shen, Feb 27 2014, 05:49 PM.
|
|
| jdege | Feb 27 2014, 06:47 PM Post #8 |
|
NSA worthy
|
There's no doubt in my mind that the level of process sophistication that is appropriate for a project varies widely, depending upon the project's size, complexity, schedule, and expected lifetime. Nobody would consider it reasonable to configure a Continuous Integration server for a 20-line python script. That said, I think most amateur programmers would be well-served to understand the basics of version control and of unit testing. |
|
| fiziwig | Feb 27 2014, 07:34 PM Post #9 |
|
Elite member
|
Of course the FORTH language was built around this approach right from the start. See, for example: http://www.forthfreak.net/thinking-forth.pdf |
|
| jdege | Feb 27 2014, 09:53 PM Post #10 |
|
NSA worthy
|
Thanks for that. It was one of my favorite books, back in the day. The first chapter gives a clear exposition of the development of software style through the seventies. But it gives short shrift to the advances that have been made since the late 80's.
|
| novice | Feb 27 2014, 11:03 PM Post #11 |
|
Super member
|
I agree with that and look forward to your next instalment. |
|
| jdege | Mar 1 2014, 04:27 PM Post #12 |
|
NSA worthy
|
So, we've been talking about dependencies. What, exactly, is a dependency?

In simplest terms, it's where one bit of code accesses a bit of code or data that lies outside itself. If I'm writing a function, for example, the code within the function might access variables that are local to the function. We would not consider that to create a dependency on those variables, because we consider those variables to be a part of the function. If, though, the function accessed variables that existed outside itself, we would consider that as creating a dependency.

Consider a function that calculates the area of a polygon, given a set of vertices. If it stored those vertices internally, it'd have no dependencies, but it would only calculate the area of the specific polygon it contained, which wouldn't make it very useful. For it to do anything of consequence, we would have to have some way of indicating which polygon we wanted it to calculate the area of. We could do this in any number of ways: passing them as arguments, setting a particular set of global variables, writing them to a file that the function would read, blinking the scroll-lock LED on the keyboard and reading them from the computer's web-cam, etc. The dependency isn't in how the data is obtained, it's that it needs to be obtained.

And our function wouldn't have a dependency only on the data - it would also have dependencies on the data types in which the data was provided. If, for example, each vertex was an instance of a Point class, our area function would have a dependency on the Point class. The definition of the Point class would have to be accessible or the function wouldn't work. If the set of vertices was provided as an instance of a List class, we'd have a dependency on List.

If all of this seems rather vague and circumstantial, that's because it is. We worry about dependencies because they create something we need to be aware of when we build a program.
Strictly speaking, any access of a variable or of a function creates a dependency, but we generally restrict our usage of the word to those situations in which the access creates a possible issue we need to worry about. If I'm writing in an object-oriented language, it'd be perfectly normal for me to access member variables or to call member methods of a class from within a method of that class, and that'd not be considered creating a dependency. So if I were to define a Polygon class that contained as a member a set of vertices, and an area() method that worked on those vertices, we'd not consider there to be a dependency between the method and the member. There would, though, be a dependency between the Polygon class and the Point and List classes, and between Polygon and whatever was the original source of the vertices.

So, what kinds of dependencies are there? In the world of object-oriented design, we sometimes talk about the relationships between classes as "is-a", "has-a", or "uses-a". Class Polygon might inherit from a base class Shape. That would be an instance of "is-a", and would constitute a dependency. Polygon might contain a set of vertices. That would be a "has-a" relationship, and would also be a dependency. Polygon might have a draw() method that, when passed a Canvas object as an argument, would draw itself on the Canvas. That would be a "uses-a", and would also be a dependency.

We generally don't worry about "is-a" dependencies, except when deciding what we need to include in the compile or link. The code needs to be available, but there's never an issue of which Shape class our Polygon class derives from; "is-a" indicates a relationship between classes, not between objects. The other two, "has-a" and "uses-a" (aka "composition" and "association"), involve relationships between objects. If an object "has-a" member of another class, it is responsible for its life cycle. Our Polygon "has-a" member "vertices" of type "List<Point>".
It creates the list when it is created; it destroys it when it is destroyed. If an object "uses-a" something, it is not responsible for its lifecycle. It doesn't create it, it doesn't destroy it, it simply accesses it in some way and uses it.

Generally speaking, the "has-a" relationships of a class are simpler, in that we don't need to worry about them in order to use the class. When an object uses an object outside of itself, we need to provide it with that object. Objects that a class creates by itself, and controls by itself, we needn't worry about. Except during testing.
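A minimal sketch of those three relationships, using the thread's Polygon example. The shoelace-formula area and the canvas's plot() method are my own illustrative choices, not part of the discussion above:

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

class Shape:
    # Polygon "is-a" Shape: a relationship between classes, not objects.
    def area(self):
        raise NotImplementedError

class Polygon(Shape):
    def __init__(self, vertices):
        # "has-a": Polygon owns its vertex list, creating it here and
        # controlling its lifetime.
        self.vertices = list(vertices)

    def area(self):
        # Shoelace formula over the owned vertices.
        v, n = self.vertices, len(self.vertices)
        s = sum(v[i].x * v[(i + 1) % n].y - v[(i + 1) % n].x * v[i].y
                for i in range(n))
        return abs(s) / 2

    def draw(self, canvas):
        # "uses-a": the canvas is handed in from outside; Polygon
        # neither creates nor destroys it.
        for p in self.vertices:
            canvas.plot(p.x, p.y)

square = Polygon([Point(0, 0), Point(1, 0), Point(1, 1), Point(0, 1)])
print(square.area())   # 1.0
```

Note that area() touches only the owned "has-a" member, so it can be called with no outside help; draw() needs a canvas supplied from outside, which is exactly the kind of dependency a test has to provide.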
|
| novice | Mar 1 2014, 06:30 PM Post #13 |
|
Super member
|
Although my programming is not very complex I do have to worry about dependencies - though I am not sure they are of the same type you are describing. Typically one of my cipher solving programs will have these functions:

- remove spaces from the ciphertext and validate the characters;
- load the scoring data of the correct type (e.g. language N-grams);
- make a random key or keys, depending on the cipher;
- run one of the library of stochastic algorithms to develop the key;
- decipher with a given key;
- define the fitness of the key by assessing the quality of the plaintext derived from it;
- report the results when a fitter key is found.

There are some important dependencies. For example, for a given cipher type the correct deciphering algorithm must be applied. Again, the scoring data loaded must be of the appropriate language. If by mistake I call data for Catalan when I'm trying to solve a German cipher the result won't be too good. So your unfolding narrative is still of interest to me.
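A pipeline like the one listed above can have its decipher and scoring steps passed in from outside, so the solver itself never hard-wires a cipher type or a language's data, and can be tested with trivial stand-ins. Everything here (the function names, the Caesar stand-in, the one-word scorer) is invented for illustration:

```python
# The solver depends only on the injected callables, not on any
# particular cipher type or scoring data.
def solve(ciphertext, decipher, score, candidate_keys):
    best = max(candidate_keys, key=lambda k: score(decipher(ciphertext, k)))
    return best, decipher(ciphertext, best)

# Trivial stand-ins for a test: a Caesar shift and a one-word scorer.
def caesar_decipher(text, key):
    return "".join(chr((ord(c) - 97 - key) % 26 + 97) for c in text)

def score(plaintext):
    return plaintext.count("attack")

key, plain = solve("nggnpx", caesar_decipher, score, range(26))
print(key, plain)   # 13 attack
```

The same solve() would run against a transposition cipher with German N-gram scoring just by handing it different callables - the Catalan-data-for-a-German-cipher mistake becomes a wiring decision made in one place.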
|
| mok-kong shen | Mar 1 2014, 09:05 PM Post #14 |
|
NSA worthy
|
OOP no doubt has many enthusiasts and can, I believe, be suitable and advantageous for quite a number of task environments. It is on the other hand interesting to see the "Criticism" section of http://en.wikipedia.org/wiki/Object-oriented_programming

OOP can't be a panacea in software engineering IMHO. (For otherwise no new non-OO PLs would come out.)

Edited by mok-kong shen, Mar 2 2014, 09:14 AM.
|
|
| WTShaw | Mar 2 2014, 08:56 PM Post #15 |
|
Advanced Member
|
I've been away for a few days but I'm back. In my history of programming, I suppose that I have made every error possible, made things work, and in the current endeavor sought to replace the bad with the good in a systematic way. It has been nothing to retrofit dozens of programs to a better ideal. I usually start with something that works and visualize what might be changed, or at least tried, to reach a new goal.

While brute force means testing all the keys of an algorithm, my idea of brute force is to wrangle output of the current state of a manipulation after every one or very few new lines of untested code, see if what I want is happening, and try some more. I find that rather than always going to an old function, it is often best to put that code into the module under development, at least to see that it works first. In a final cleanup, I'll try to remove commented-out test lines, but some of the evolutionary traces may never be removed.

With JavaScript or any other language, many complex approaches that lead away from older structures can simply make the code hard to follow, not useful. I've taken many programs others have written and restructured them to my own prejudices... seems to work well thus far. Within a comprehensible working framework, small changes can be effectively tested, changed, retested... ad infinitum.
|