In many practical situations, it is necessary to describe an image in words. From the purely logical viewpoint, to describe the same object, we can use concepts of different levels of abstraction: e.g., when the image includes a dog, we can say that it is a dog, or that it is a mammal, or that it is a German Shepherd. In such situations, humans usually select the concept which, to them, is the most natural; this concept is called the basic level concept. However, the notion of a basic level concept is difficult to describe in precise terms; as a result, computer systems for image analysis are not very good at selecting basic level concepts. At first glance, since the question is how to describe human decisions, we should use notions from (well-developed) decision theory -- such as the notion of utility. However, in practice, a well-founded utility-based approach to selecting basic level concepts is not as efficient as a heuristic "similarity" approach. In this paper, we explain this seeming contradiction by showing that the similarity approach can actually be explained in utility terms -- if we use a more accurate description of the utility of different alternatives.