Tag: software engineering

Entries for tag "software engineering", ordered from most recent. Entry count: 31.

Warning! Some information on this page is older than 5 years now. I keep it for reference, but it probably doesn't reflect my current knowledge and beliefs.

Pages: 1 2 3 4 >

# The Virtual Reality of Code

Nov 2015

In my opinion, coding is a virtual reality with its own set of rules unlike in the physical world. In physical world, each thing has its specific location at the moment, things don't appear or disappear instantly, we have laws of energy and mass conservation. Inside a computer, we have data in memory (which is actually linear - 1D) and processors, processes and threads executing instructions of a program over time.

I have been trying for years to imagine some nice way editing or at least visualizing code, which would be more convenient (and look more attractive) than the text representation we all use. I am unsuccessful, because even if we render some nice looking depiction of a function code, its instructions, conditions and loops, drawing all the connections with variables and functions that this code uses would clutter the image too much. It's just like this virtual world doesn't fit into a 2D or 3D world.

Movie depiction of computer programs is often visually attractive, but far from being practical. This one comes from "Hackers" movie.

Of course there are ways of explaining a computer program on a diagram, e.g. entity diagrams or UML notation. I think that electrical diagrams are something in between. Electrical devices end up as physical things, but on a diagram, the shape, size and position of specific elements doesn't match how they will be arranged on a PCB. Logical representation and electrical connections between them is the only thing that matters.

Today it occured to me that this virtual reality of code also has "dimensions", in its own sense. It's evident when learning programming.

1. First, one has to understand control flow - that the processor executes subsequent instructions of the program over time and that there can be jumps caused by conditions, loops, function calls and returns. It's called dynamic aspect of the code. It can be depicted e.g. by UML sequence diagram or activity diagram.

I still remember my colleague at university who couldn't understand this. What's obvious to every programmer, was a great mystery to him as first year's computer science student. He thought that when there is a function above main(), instructions of that function are executed first and then instructions of the main function. He just couldn't imagine the concept of calling one function from the other and returning from it.

2. Then, there is so called static aspect - data structures and objects that are in memory at given moment in time. This involves understanding that objects can be created in one place and destroyed in another place and at later time, that there may be a collection of multiple objects (arrays, lists, trees, graphs and other data structures), objects may be connected to one another (the concept of pointers or references). Various relationships are possible, like one-to-one and one-to-many. In object-oriented methodologies, there is another layer of depth here, as there are classes and interfaces (with inheritance, composition etc.) and there are actual objects in memory that are instances of these classes. It can be depicted for example by entity diagram or UML class diagram.

3. Finally, one has to learn about version control systems that store history of the code. This adds another dimension to all the above, as over successive commits, developers make changes to the code - its structure and the way it works, including format of data structures it uses. Branches and merges add even more complexity to it. GUI apps for version control systems offer some way of visualizing this, whether it's showing a timeline with commits ("History") or showing who and when commited each line of a file ("Blame").

There is even more to it. Source is organized into files and directories, which may be more or less related to the structure of contained code. Multithreading (and coprocessing e.g. with GPU, SIMD, and all kinds of parallel programming) complicates imagining (and especially debugging!) control flow of the program. Program binaries and data may be distributed into multiple programs and machines that have to communicate.

It fascinates me how software is so multidimensional in its own way, while so flat at the same time (after all, computer memory is linear). I believe that becoming a programmer is all about learning how to imagine, navigate and create stuff in this virtual reality.

Comments | #philosophy #software engineering Share

# The Importance of Good Debugging Tools

May 2015

Robert L. Glass wrote an interesting article "Frequently Forgotten Fundamental Facts about Software Engineering". While I am aware he is far more experienced than me and he has sources to backup his opinions, I dare to disagree with following paragraph:

T1. Most software tool and technique improvements account for about a 5- to 30-percent increase in productivity and quality. But at one time or another, most of these improvements have been claimed by someone to have "order of magnitude" (factor of 10) benefits. Hype is the plague on the house of software.

I think that having good tools is one of the most important (and sometimes underrated) factors in programmers' efficiency. From comfortable chair, through fast desktop PC, big monitors, fast Internet connection, to good software, including IDE (editor, compiler, debugger etc.) and auxiliary applications (like Total Commander for file management) - they can all make big difference in how developers feel about their job and how fast the work is going.

Of course, "tools" is a broad term. If by choosing good tools we mean changing used programming language or libraries in the middle of the project, then sure it is usually a bad idea. Writing a script to automate some task that can be done manually in just few minutes or even an hour is usually not worth doing as well.

But what is always worth investing time and money in (either buying or developing your own, learning to use them) are tools that help with debugging. As debugging is the hardest and most time consuming part of programming, any improvement in that process can make as big difference as between minutes and weeks - often in the most critical time, close to the deadline.

So in my opinion, being able to interactively debug executing program (or its trace) - to setup a breakpoint and preview current value of variables (as opposed to relying only on some debug text prints, analyzing logs or some command-line tools) is an absolute minimum to be able to call any programming environment reasonable, mature and eligible to write any serious software in it. If you are the one who writes a program and you have to treat that program as a black box while it is running, without a possibility to peek inside, then something is wrong. AFAIK that is the case with some technologies that deploy program to a server or an embedded device.

How is it in graphics programming? John Carmack once tweeted:

gl_FragColor.x = 1.0; is the printf of graphics debugging. Stone knives and bearskins.

While I share his frustration with such primitive methods of debugging, the reality of today's GPU programming is not all that bad as we could only render red pixels to see any debug information. NVIDIA Nsight offers debugging of a GPU, and so does Intel Graphics Performance Analyzers (GPA). Finally, there is a vendor-agnostic debugger for DirectX developed by Microsoft. Originating from Xbox (that is where its name comes from - Performance Investigator for Xbox), PIX used to be available for free as a standalone application. Recently, they have integrated it as part of Visual Studio called Graphics Diagnostics. It was available in commercial versions of Visual Studio only and not in free Express edition (very bad news for all amateur DirectX game developers), but finally they shipped it for free together with the new, fully-functional Visual Studio Community edition.

With these tools, there is no explanation for whining "nothing renders and I don't know why" - just go debug it! :) The future also looks bright. DirectX 12 will offer Debug layer. Vulkan also claims to support debugging tools and layers and LunarG company even started working on such tool - GLAVE.

Comments | #debugger #software engineering Share

# Too Low Level and Too High Level Abstraction

Jul 2011

Programmers fascinated by object-oriented methodology love the idea of abstraction and generalization, but others - especially those interested in data-oriented design - consider "premature generalization" as a big anti-pattern. I prefer the latter approach, so as I recently learn a little bit of Java, including Servlers and JSP, I feel terrified of how its API looks like. Let me explain...

Today an idea came to my mind that an API can be too abstract (general, universal) at low level or high level. The most abstract low-level API looks like this:

int write(void *buf, int bytes);
int read(void *buf, int bytes);

It's so general and universal that it would fit anywhere, but on the other hand it doesn't provide anything specific. It's just a very narrow pipe we have to pass all our data through. Of course this method is sometimes useful. For example that's the way we transfer data over network. But we do it using some specific protocol so it's reasonable to define a higher-level API with some objects that encapsulate concepts specific to that protocol, just like cURL implements HTTP, FTP and other network protocols on top of network sockets.

Let's compare some details from two APIs. A socket can have additional options that can be set. There are different options for different kinds of sockets and they have different types - some of them are bools, others are ints. But there is only one function to set such options - setsockopt:

int setsockopt(SOCKET s, int level, int optname, const char *optval, int optlen);

Objects in POSIX Threads API also can have additional attributes, but there are specific functions to set them, like in the example below. Which way do you prefer?

int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);
int pthread_attr_setstacksize(pthread_attr_t *attr , size_t stacksize);

But excessive abstraction can also go too far the other way. That's what I see in Java API. I mean, these who code a lot in this language certainly can "feel" the sense in this approach, but for me it's almost ridiculous. Not only every part of the Java API has dozens of small classes with not much more than simple getName/setName methods and there is an interface for everything just like people were afraid of using real classes, but lots os stuff is refered by strings. I'd say Java is so high level that it's not only a strongly-typed language, it's stringly-typed language. Lots of classes implementing some interface can be registered in some global registry under its name so an instance can be constructed without ever refering to the real class, like: new SomeSpecificClass().

Probably the most grotesque example I saw today is java.naming package. Including its subpackages, it contains about hunred of classes and interfaces. But there is more than this. It's the whole great body of knowledge called JNDI (Java Naming and Directory Interface) with long tutorials and lots of concepts to understand, like Context, InitialContext, Directory, Service Provider, Lookup, Names, Composite Names, Compound Names, Naming System, Federations and so on... All this just to provide an abstract interface for a tree of any objects referred by name, so that disk files and directories can be accessed same way as data received with LDAP or some global variables defined in an XML file. Do anyone really needs that? The javax.naming.Context interface is not far from what would be the ultimate high level abstraction:

interface SuperAbstractEverything {
  public Object getObject(String namePathIdOrWhatever);

Comments | #java #software engineering Share

# My Talk in Kraków - Slides

Mar 2011

I just went back from Kraków where I've given a talk at AGH University of Science and Technology (Akademia Górniczo-Hutnicza), at SKN Shader scientific group. The title of my presentation was "Pułapki programowania obiektowego" ("Pitfalls of Object-Oriented Programming"). Here are the slides: [PDF], [PPTX]. The presentation is new, although you could already see most of the slides at IGK conference or at Polygon scientific group.

By the way, I want to thank Koshmaar, TeMPOraL and all members of the SKN Shader for the wonderful time in Kraków. You can see several photos that I took there in my gallery on Picasaweb and a panorama photo on Panogio.

Comments | #events #software engineering #teaching Share

# March 29th, Kraków - my Next Presentation

Mar 2011

Tomorrow I go to Kraków to give a talk at AGH University of Science and Technology (Akademia Górniczo-Hutnicza). I've been invited by my friends from SKN Shader scientific group (greetings Koshmaar and TeMPOraL :)

The title of my presentation will be "Pułapki programowania obiektowego" ("Pitfalls of Object-Oriented Programming"). The event will take place at 18:30, building B1, room H24. More information is available in this entry on SKN Shader website, as well as Facebook event.

Comments | #events #teaching #c++ #software engineering Share

# Static C++ Code Analysis with PVS-Studio

Mar 2011

By the courtesy of its authors, I have a chance to evaluate PVS-Studio - a static code analyzer for C, C++ and C++0x. This commercial application is installed as a plugin in Visual Studio 2005/2008/2010. Fortunately I have Visual Studio 2008 Professional at home so I could try it with the code of my personal projects. PVS-Studio differs from other tools of this kind, like free Cppcheck, by finding three types of errors or warnings: general, related to OpenMP and 64-bit portability issues.

After opening my solution in Visual Studio, I choose a command from the special menu to analyze all the code.

A progressbar appears while PVS-Studio does the computations, utilizing almost 100% of all 4 CPU cores. Finally, a dockable panel appears with a list of found issues.

The general category warns about exact float comparison with == and stuff like that. It managed to find few places where I forgot the "&" character while passing a vector as const refefence parameter, rightly telling that it will cause "decreased performance". But its greatest find in my CommonLib library code was this unbelievable bug:

Some messages look funny. Should I code some general, abstract, portable, object-oriented, Alexandrescu-style template-based solution here just to avoid copying some code into several similar instructions? :)

I didn't check how the OpenMP validation works because I don't currently use this extension. As for 64-bit compatibility issues, I have lots of them - just because my code is not prepared to be compiled as 64-bit. PVS-Studio seem to do a good job pointing to places where fixed-length 32-bit integers are mixed with pointers, array indexing etc.

Overall, PVS-Studio looks like a good tool for C++ programmers who care about the quality of their code. Finding issues related to OpenMP and 64-bit compatibility can be something of a great value, if only you need such features.

Too bad that PVS-Studio, opposite to Cppcheck, is a Visual Studio plugin, not a standalone application, so it obviously requires you to have a commercial MSVS version and do not work with Express edition. But this is understandable - if you need OpenMP or 64-bit, you probably already use Visual Studio Professional or higher.

PVS-Studio analyzes C, C++ and C++0x. It doesn't work with C++/CLI language, but that's not a big flaw too. I use C++/CLI at work, but I can see it's quite unpopular, niche language. Its compilation or analysis would also be very difficult because it mixes all features from both native C++ and .NET. Even Microsoft didn't find resources to implement IntelliSense for C++/CLI in Visual Studio 2010.

Comments | #software engineering #c++ #tools #visual studio Share

# Data-Oriented Design - Links and Thoughts

Jan 2011

In April 2008 I've written an essay "Fanatyzm obiektowy" (in Polish, it means "Object-Oriented Fanaticism"). I've always believed there is something wrong with object-oriented programming, that it simply doesn't meet its own objectives and so following it blindly as an ideology not only a programming language mechanics has many pitfalls. Now I'm glad that recently a concept of "Data-Oriented Design" (DOD) emerged and gained popularity among game developers. Here is my try to aggregate all important information on this subject that can be found on the Internet:


Blog entries:


If you know any other good readings on this subject, please leave a comment. I'll update my list.

As far as I can see, focusing more on data instead of objects gives a number of benefits for the code:

Of course DOD doesn't exist in the void. It's related to many other concepts and you can find many good sources of knowledge about each one. Some of them are:

Comments | #software engineering #c++ Share

# Different Ways of Processing Data

Dec 2010

Much is being said about UX these days, while not so much about the art and science of designing good API so libraries can communicate successfully with their users, that is other programmers. I'm interested in the latter for some time and today I'd like to explore the topic of how some data, whether objects or a raw sequence of bytes, can be processed, loaded and saved. I will clarify what I mean in just a moment. I believe the ways to do it can be grouped into several categories, from simplest but most limited way to the most flexible, efficient but difficult to use.

1. The simplest possible interface for loading (or saving) some data is to pass a string with path to file. That's the way we load DLL libraries in WinAPI. Unfortunately it limits the programmer to load object only from physical files, not from other places like memory pointer, where source data could be placed e.g. after decompression or downloading from network.

HMODULE WINAPI LoadLibrary(__in LPCTSTR lpFileName);

2. Solution to this problem is an API that allows loading data from either file or memory. Some example can be texture loading in D3DX (extension to DirectX), where separate functions are available that take either path to a disk file (LPCTSTR pSrcFile), pointer to a buffer in memory (LPCVOID pSrcData, UINT SrcDataSize) or Windows resource identifier (HMODULE hSrcModule, LPCTSTR pSrcResource).

HRESULT D3DXCreateTextureFromFile(
  __in LPDIRECT3DDEVICE9 pDevice,
  __in LPCTSTR pSrcFile,
  __out LPDIRECT3DTEXTURE9 *ppTexture);
HRESULT D3DXCreateTextureFromFileInMemory(
  __in LPDIRECT3DDEVICE9 pDevice,
  __in LPCVOID pSrcData,
  __in UINT SrcDataSize,
  __out LPDIRECT3DTEXTURE9 *ppTexture);
HRESULT D3DXCreateTextureFromResource(
  __in LPDIRECT3DDEVICE9 pDevice,
  __in HMODULE hSrcModule,
  __in LPCTSTR pSrcResource,
  __out LPDIRECT3DTEXTURE9 *ppTexture);

Another possible approach is to utilize single function and interpret given pointer as either string with file path or a direct memory buffer, depending on some flags. That's the way you can load sound samples in FMOD library:

FMOD_RESULT System::createSound(
  const char * name_or_data,
  FMOD_MODE mode,
  FMOD::Sound ** sound

Where name_or_data is "Name of the file or URL to open, or a pointer to a preloaded sound memory block if FMOD_OPENMEMORY/FMOD_OPENMEMORY_POINT is used."

3. That's more flexible, but sometimes an object is so big that it's not efficient or even possible to load/uncompress/download its full contents into memory before creating a real resource or do some processing. What's needed is an interface to process smaller chunks of data at time. One of ways to do it is defining an interface with callbacks that the library will call to query for additional piece of data. Then we can implement this interface to read data from any source we wish, whether simple disk file, compressed archive or a network socket. When we want to load an object, we call appropriate function passing pointer to our implementation of the interface. During this call our code is called back and asked to read data. For example, that's the way we can load sounds in Audiere library. The interface for reading data is:

class File : public RefCounted {
  ADR_METHOD(int) read(void* buffer, int size) = 0;
  ADR_METHOD(bool) seek(int position, SeekMode mode) = 0;
  ADR_METHOD(int) tell() = 0;

4. A step futher towards more flexibility and generality is the concept of streams, like from Java, C# or Delphi. These object-oriented languages define in their standard libraries an abstract base class for input (for reading data) and output stream (for writing data) that can be implemented in many possible ways. For example, Java's InputStream class defines methods:

void close()
int read()
int read(byte[] b)
int read(byte[] b, int off, int len)
void reset()
long skip(long n)

Many derived classes are provided. Some of them read/write data from real sources like file or network connection, while others process data (for example compress, encrypt) and pass them to another stream. This way a chain of responsibilities can be created where we write data to a stream that compresses them and pass them to another stream, which encrypts them and passes them to the one that does buffering and finally write the data to the stream that saves it to a file. It's my favourite approach right now, although it has a drawback - an overhead for virtual method calls in each stream for each piece of data read or written. This inefficiency can be minimized by controlling granularity - processing a buffer of reasonable size at time, never byte-after-byte.

5. Finally, there is the most direct, low-level approach which is also most flexible and efficient, but at the same time very difficult to use properly. I'm talking about a single function that takes pointers to input and output buffers, as well as some structure containing current state and processes a piece of data. It consumes some/all data from input buffer (by advancing some pointer or counter) and produces new data to the output buffer. There are no callbacks. The interface is neither "push" (where we write data) or "pull" (where we read data), but both at time. That's the way zlib compression library works (which I complained about here, in Polish), as well as LZMA SDK (which I described here).

typedef struct z_stream_s {
  Bytef  *next_in;  /* next input byte */
  uInt   avail_in;  /* number of bytes available at next_in */
  uLong  total_in;  /* total nb of input bytes read so far */

  Bytef  *next_out; /* next output byte should be put there */
  uInt   avail_out; /* remaining free space at next_out */
  uLong  total_out; /* total nb of bytes output so far */

  char   *msg;    /* last error message, NULL if no error */
  struct internal_state FAR *state; /* not visible by applications */
} z_stream;

ZEXTERN int ZEXPORT deflate OF((z_streamp strm, int flush));

Comments | #software engineering #c++ Share

Pages: 1 2 3 4 >

[Stat] [STAT NO AD] [Download] [Dropbox] [pub] [Mirror]
Copyright © 2004-2017