MetaData, Objects, Relations: Similarities and Differences and Cognitive Aspects of Categorization

  • MetaData, Objects, Relations: Similarities and Differences and Cognitive Aspects of Categorization

    SIMS 202, Lecture 10
    Fall, 1997
    Prof. Marti Hearst

    SIMS 202, Marti Hearst

  • Today: Four Related Questions
      • Why are we learning about
          • metadata
          • database design
          • object oriented systems?
      • How are these related to one another?
      • How are these different from one another?
      • Why is it hard to define/design these things?
          • What cognitive science is
          • What cogsci tells us about categorization

  • Why are we learning about metadata, database design and OO systems?
      • Information organization
      • These are all ways to handle complexity, by imposing structure and order on messy data
      • Each is useful in a different way

  • How is the Relational Model Related to the Object Oriented Model?
      • Let's start with a re-description of objects.
      • Objects are instantiated classes
      • Classes have attributes
      • An attribute is the TYPE of information (kind of like a data type in a programming language)
      • Attributes have VALUES that fit their TYPE
          • attribute TYPE: integer, VALUE: 9
          • attribute TYPE: suit, VALUE: club, heart, spade, diamond
          • attribute TYPE: name, VALUE: Juanita, Dekai, Laura
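
The TYPE/VALUE distinction can be sketched in Python. This is only an illustration: the names `Suit`, `Card`, `rank`, and `nine_of_clubs` are assumptions made for the example and do not come from the lecture.

```python
from dataclasses import dataclass
from enum import Enum


class Suit(Enum):
    """An attribute TYPE whose legal VALUES are the four suits."""
    CLUB = "club"
    HEART = "heart"
    SPADE = "spade"
    DIAMOND = "diamond"


@dataclass
class Card:
    """A class: it declares attributes and the TYPE of each."""
    rank: int   # attribute TYPE: integer
    suit: Suit  # attribute TYPE: suit


# Instantiating the class yields an object whose attributes hold VALUES.
nine_of_clubs = Card(rank=9, suit=Suit.CLUB)
print(nine_of_clubs.rank, nine_of_clubs.suit.value)
```

The class fixes which attributes exist and what type each has; the object only supplies values that fit those types.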

  • Attributes vs. Classes
      • How do we make this distinction?
      • Say we are clothing manufacturers.
          • Fur is a class
          • Animal is an attribute
      • Say we are naturalists.
          • Animal is a class
          • Fur is an attribute

  • Garment Makers vs. Naturalists
      • Class Fur
          • Animal: fox, rabbit, sable
          • Color: red, black, white
          • Texture: silky, thick, coarse
          • Garment_type: coat, stole, hat
      • Class Animal
          • Outer_Covering: fur, skin, scales
          • Number_of_limbs: 4, 6, 8
          • Circulatory_System: cold_blooded, hot_blooded

  • Attributes vs. Classes
      • This example showed that one user's classes are another user's attributes.
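
A rough sketch of the two viewpoints as Python classes. The attribute names follow the Fur and Animal examples; the object names `fox_fur` and `fox` and the particular values chosen are assumptions for illustration.

```python
from dataclasses import dataclass


# Garment maker's view: Fur is the class, animal is one of its attributes.
@dataclass
class Fur:
    animal: str        # fox, rabbit, sable
    color: str         # red, black, white
    texture: str       # silky, thick, coarse
    garment_type: str  # coat, stole, hat


# Naturalist's view: Animal is the class, fur is one value of an attribute.
@dataclass
class Animal:
    outer_covering: str      # fur, skin, scales
    number_of_limbs: int     # 4, 6, 8
    circulatory_system: str  # cold_blooded, hot_blooded


fox_fur = Fur(animal="fox", color="red", texture="silky", garment_type="coat")
fox = Animal(outer_covering="fur", number_of_limbs=4,
             circulatory_system="hot_blooded")
```

Neither design is the correct one; which concept becomes the class depends on what the user needs to describe.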

  • Let's Revisit Relations
      • A table contains rows of data
      • The data has attribute types
      • Can perform certain operations:
          • select (pick out rows)
          • project (pick out columns)
          • join (match up two or more tables' data)
          • add (add a new row)
          • delete (delete a row)
          • update (change a value within a row)
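
The six operations can be tried out with Python's standard `sqlite3` module. This is a minimal sketch; the table names `fur` and `garment`, their columns, and the sample rows are assumptions loosely based on the Fur example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fur (animal TEXT PRIMARY KEY, color TEXT, texture TEXT);
    CREATE TABLE garment (garment_type TEXT, animal TEXT);
""")

# add: insert new rows
conn.executemany("INSERT INTO fur VALUES (?, ?, ?)",
                 [("fox", "red", "silky"), ("rabbit", "white", "thick"),
                  ("sable", "black", "silky")])
conn.executemany("INSERT INTO garment VALUES (?, ?)",
                 [("coat", "fox"), ("hat", "rabbit")])

# select: pick out rows
silky = conn.execute(
    "SELECT * FROM fur WHERE texture = 'silky' ORDER BY animal").fetchall()

# project: pick out columns
colors = conn.execute("SELECT color FROM fur ORDER BY animal").fetchall()

# join: match up rows from two tables
joined = conn.execute(
    "SELECT g.garment_type, f.color FROM garment g "
    "JOIN fur f ON g.animal = f.animal ORDER BY g.garment_type").fetchall()

# update: change a value within a row
conn.execute("UPDATE fur SET color = 'gray' WHERE animal = 'rabbit'")

# delete: remove a row
conn.execute("DELETE FROM fur WHERE animal = 'sable'")
remaining = conn.execute(
    "SELECT animal, color FROM fur ORDER BY animal").fetchall()
```

Every operation here is general purpose: it works the same way on any table, regardless of what the rows describe.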

  • Relations vs. Objects
      • ER Diagram:
          • Entity = Class
          • Attribute = Attribute
          • Relation ~ Method
      • Relational Table
          • Table ~ Class
          • Row ~ Instantiated Object of Class
          • Column = Attribute TYPE
          • Value in (row, column) = Attribute VALUE
          • Name ~ Primary Key

  • Relations vs. Objects
      • There are no Class-specific Methods in the Relational Model
          • There are general-purpose methods on all data: update (change), select, delete, add, join, project
          • The Relation in the ER diagram indicates how to set up the tables so they can be easily joined
      • There is no unique Object Id (Address) in the Relational Model
          • Can only access an instantiated object by combinations of its attribute values
          • Normalization can cause the object representation to be spread out across several tables
      • No encapsulated data in the Relational Model
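
A small Python sketch of the identity difference. The `Person` class, its `greet` method, and the city values are hypothetical; only the names Laura and Dekai come from the earlier slide.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    city: str

    def greet(self) -> str:
        # A class-specific method: behavior packaged with the data.
        return f"Hello, {self.name}"


# Two objects with identical attribute values are still distinct objects.
a = Person("Laura", "Berkeley")
b = Person("Laura", "Berkeley")
same_values = (a == b)   # dataclass equality compares attribute values
same_object = (a is b)   # identity compares the objects themselves

# A relational table has no object identity: rows are found only by values.
rows = [
    {"name": "Laura", "city": "Berkeley"},
    {"name": "Dekai", "city": "Hong Kong"},
]
found = [row for row in rows if row["name"] == "Laura"]
```

The objects `a` and `b` can be told apart even though every attribute matches; two identical rows in a table cannot, which is one reason relational tables rely on a primary key.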

  • Garment Maker vs. Garment Maker
      • Class Fur
          • Animal: fox, rabbit, sable
          • Color: red, black, white
          • Texture: silky, thick, coarse
          • Garment_type: coat, stole, hat
      • Class Garment
          • Material: fur, cotton, wool
          • Color: red, black, brown, white, blue
          • Garment_Type: coat, stole, hat
      • Problem: match color to material

  • Nesting Attributes and Classes
      • Class Garment
          • Material:
              • Class Fur
                  • Animal: fox, rabbit, sable
                  • Color: red, black, white
                  • Texture: silky, thick, coarse
              • Class Cotton
                  • Color: red, blue, white, brown, black
                  • Thread_Count: 100, 200
          • Garment_type: stole, coat, hat, t-shirt
      • Attributes often must be nested
      • Alternative: two subclasses of Garment
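
One way to sketch the nested design in Python, assuming the attribute names from the slide. The object names `stole` and `t_shirt` and the specific values are illustrative only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Fur:
    animal: str
    color: str
    texture: str


@dataclass
class Cotton:
    color: str
    thread_count: int


@dataclass
class Garment:
    # The material attribute is itself an object of another class,
    # so color is always stored together with the material it belongs to.
    material: Union[Fur, Cotton]
    garment_type: str


stole = Garment(material=Fur("sable", "black", "silky"), garment_type="stole")
t_shirt = Garment(material=Cotton("blue", 200), garment_type="t-shirt")
```

Nesting addresses the "match color to material" problem: a color can only appear inside the material that allows it. The alternative mentioned on the slide would instead define two subclasses of Garment.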

  • Normalization and Nesting
      • In the Relational Model, Normalization flattens out the Nesting
      • Why?
          • Normalization makes certain kinds of access more efficient and makes updates less likely to go wrong
      • Why isn't this confusing in the OO model?
      • Key: Relational and OO are used in different situations
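
A minimal sketch of flattening, assuming a nested garment record like the one in the nesting example. The function name `flatten`, the `material_id` key, and the sample data are all hypothetical.

```python
# Nested representation: each garment carries its material inside it.
garments = [
    {"garment_type": "stole", "material": {"kind": "fur", "color": "black"}},
    {"garment_type": "coat", "material": {"kind": "fur", "color": "black"}},
    {"garment_type": "t-shirt", "material": {"kind": "cotton", "color": "blue"}},
]


def flatten(nested):
    """Split nested records into two flat tables linked by a material_id key."""
    material_ids = {}    # (kind, color) -> material_id
    material_table = []
    garment_table = []
    for g in nested:
        key = (g["material"]["kind"], g["material"]["color"])
        if key not in material_ids:
            material_ids[key] = len(material_ids) + 1
            material_table.append(
                {"material_id": material_ids[key],
                 "kind": key[0], "color": key[1]})
        garment_table.append(
            {"garment_type": g["garment_type"],
             "material_id": material_ids[key]})
    return material_table, garment_table


material_table, garment_table = flatten(garments)
```

After flattening, each material is stored once, so changing its color means updating a single row. The cost is that one garment's description is now spread across two tables and has to be joined back together.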

  • Relations vs. Objects
      • Objects: nomads, doing their own thing, rugged individualists, used one at a time
          • Example: program running on a printer
      • Relations: packed into apartments, lots and lots of items all being lined up in one place for easy comparison
          • Queries: Find all X that have Y and are > Z
          • Example: Everyone's phone bill in the U.S.

  • Relations vs. Objects
      • Can you have a table of objects?
      • Can you have an object that has a table?

  • Metadata vs. Objects
      • MetaData like the Dublin Core is simple
          • Much like the name, attribute parts of a Class
          • No methods
      • MetaData like AACRII is messier
          • A bunch of rules about how to deal with the exceptions
          • Law deals quite a bit with exceptions
          • Computer Science tries as hard as possible to abstract away or ignore exceptions
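
To illustrate the comparison, here is a sketch of a Dublin Core style record as a Python class with attributes and no methods. Only four of the Dublin Core elements are shown; the class name `DublinCoreRecord` and the `subject` value are assumptions, while the title, creator, and date are taken from this lecture's own title slide.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DublinCoreRecord:
    """A few Dublin Core elements: named attributes only, no methods."""
    title: str
    creator: str
    date: str
    subject: str


record = DublinCoreRecord(
    title="MetaData, Objects, Relations: Similarities and Differences "
          "and Cognitive Aspects of Categorization",
    creator="Marti Hearst",
    date="1997",
    subject="categorization",  # illustrative value, not from a catalog record
)
print(asdict(record))
```

A flat record like this has no room for the exception-handling rules that a cataloguing code such as AACRII spells out.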

  • Why are we learning about this old library stuff?
      • The computer science tradition is good at abstracting away details.
      • The computer science tradition is not good at describing detail and convoluted exceptions.
      • The library tradition can teach us something useful about how to describe complex data.
      • Think about how these bibliographic examples can be applied to other domains (maybe a test question!!!)

  • Metadata vs. Relational Model
      • Relational model makes use of Metadata
          • The description of the database is often called a Schema
          • The Schema is a kind of Metadata description
      • Main differences:
          • Exceptions are not handled well in the relational model either
          • Relational model focus is on the system design
          • Metadata focus is on the description of the data, independent of a computer system
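
The idea that a schema is metadata can be seen directly in SQLite, which lets a program query the description of a table. This sketch uses Python's standard `sqlite3` module; the `garment` table and its columns are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE garment (garment_type TEXT, material TEXT, color TEXT)")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
schema = conn.execute("PRAGMA table_info(garment)").fetchall()
columns = [(name, col_type) for _cid, name, col_type, *_rest in schema]
print(columns)
```

The result describes the data (column names and types) without containing any of the data itself, but it is tied to one particular database system.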

  • Fresh Topic: Why is this Stuff Hard?
      • These are all variations on Categorization
      • Categorization is an important topic in:
          • Philosophy
          • Language/Linguistics
          • Psychology
      • How does the human mind do categorization?

  • What's In a Sentence?

    "A sentence is not a verbal snapshot or movie of an event. In framing an utterance, you have to abstract away from everything you know, or can picture, about a situation, and present a schematic version which conveys the essentials. In terms of grammatical marking, there is not enough time in the speech situation for any language to allow for the marking of everything which could possibly be significant to the message."

    Dan Slobin, in Language Acquisition: The state of the art, 1982

  • Approximating Meaning
      • Defining attributes
          • A weak approximation to meanings and concepts
      • Defining methods
          • A weak approximation to how these meanings interact and change
      • Necessary and Sufficient Conditions
          • Example: A prime number is an integer greater than 1 divisible only by itself and 1.
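
A definition by necessary and sufficient conditions translates directly into a yes-or-no test. The sketch below assumes the modern convention that 1 is not prime; the names `is_prime` and `primes_below_20` are chosen for the example.

```python
def is_prime(n: int) -> bool:
    """True when n meets the necessary and sufficient condition for primality."""
    if n < 2:  # by the modern convention, 0 and 1 are not prime
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


primes_below_20 = [n for n in range(20) if is_prime(n)]
print(primes_below_20)
```

The function can only answer True or False. It has no way to say that one number is a better example of a prime than another, which is the kind of judgment people routinely make about everyday categories.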

  • Properties of Categorization
      • Family Resemblance: Members of a category may be related to one another without all members having any properties in common that define the category.
      • Centrality: Some members of a category may be better examples of that category than others.

  • Centrality
      • A category: Prime Numbers
          • Definition: An integer greater than 1 divisible only by itself and 1
          • Examples: 2, 3, 5, 7, 11, 13, 17, ...
      • A very clear-cut category. Or is it?
          • Can one number be more prime than another?
      • CENTRALITY: some members of a category may be better examples than others

  • Definition of Game
      • Famous example by Wittgenstein
      • Classic categories: clear boundaries defined by common properties
      • Counterexample: Game
          • No common properties shared by all games
              • card games, ball games, Olympic games, children's games
              • competition: absent in ring-around-the-rosie
              • skill: absent in dice games
              • luck: absent in chess
          • No fixed boundary; can be extended to new games
              • video games
      • Alternative: Concepts related by Family Resemblances

  • Characteristic Features
      • Perceived degree of category membership has to do with which features define the category.
      • Members usually do not have ALL the necessary features, but have some subset.
      • Those members that have more of the central features are seen as more central members.
      • People have conceptions of typical members.

  • Properties of Categorization
      • Basic-level Categories: Categories are organized into a hierarchy from the most general to the most specific, but the level that is most cognitively basic is in the middle of the hierarchy.
      • Basic-level Primacy: Basic-level categories are functionally primary with respect to factors including ease of cognitive processing (learning, reasoning, recognition, etc.).

  • Levels of Abstraction
      • Brown 1958, 65; Berlin et al., 1972, 73
      • Folk biology:
          • unique beginner: plant, animal
          • life form: tree, bush, flower
          • generic name: pine, oak, maple, elm
          • specific name: Ponderosa pine, white pine
          • varietal name: western Ponderosa pine
      • No overlap between levels
      • Level 3 (generic name) is basic
      • Level 3 corresponds to genus
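
The five folk-biology levels can be written down as a simple ordered structure. This is a sketch using the examples from the slide; the names `LEVELS`, `path`, `BASIC_LEVEL`, and `basic_level_name` are assumptions.

```python
# Folk-biology levels, from most general (1) to most specific (5).
LEVELS = ["unique beginner", "life form", "generic name",
          "specific name", "varietal name"]

# One path down the hierarchy, using the examples from the slide.
path = ["plant", "tree", "pine", "Ponderosa pine", "western Ponderosa pine"]

BASIC_LEVEL = 3  # the slide identifies level 3 (generic name) as basic


def basic_level_name(path_to_item):
    """Return the level-3 term, the name people use most readily."""
    return path_to_item[BASIC_LEVEL - 1]


labeled = dict(zip(LEVELS, path))
print(basic_level_name(path))
```

A typical class hierarchy treats every level alike; nothing in the data structure itself marks the middle level as special, so that knowledge has to be added by hand, as `BASIC_LEVEL` does here.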

  • Characteristics of Basic-level Categories: Language
      • People name things more readily at basic level
      • Name learned earliest in childhood
      • Languages have simpler names at basic level
      • Sounds like the "real name"
      • Name used more frequently
          • Strange to call a dime "a coin" or "a metal object"
      • Names used in neutral context
          • There's a dog on the porch.
          • There's a terrier on the porch.

  • Characteristics of Basic-level Categories: Concepts
      • Things perceived more holistically at basic level (rather than by parts)
      • No difference in how people interact with the concept between basic and more specific levels
      • Things are remembered more readily at basic level
      • Folk biology categories correspond accurately to scientific biological categories only at the basic level

  • Superordinate and Subordinate Levels

      SUPERORDINATE    animal     furniture
      BASIC LEVEL      dog        chair
      SUBORDINATE      terrier    rocker

      • Children take longer to learn superordinate
      • Superordinate not associated with mental images or motor actions

  • Typicality and Characteristic Features
      • Some categories have clear boundaries, but have graded membership
      • What is a good example of a bird?
      • Examples from language:
          • A robin is a bird.
          • A chicken is a bird.
          • A bat is a bird.
      • Takes longer for people to say the second is true and the third is false
      • Features characterize the category
          • How many typical features does the object possess?
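
One simple way to model graded membership is to count how many of a category's characteristic features an item has. This is only a sketch: the feature lists below are hypothetical, chosen for illustration, and the scoring rule is an assumption rather than a model from the cognitive science literature.

```python
# Hypothetical feature lists, chosen only for illustration.
BIRD_FEATURES = {"has_feathers", "lays_eggs", "has_beak", "flies", "sings"}

ANIMALS = {
    "robin": {"has_feathers", "lays_eggs", "has_beak", "flies", "sings"},
    "chicken": {"has_feathers", "lays_eggs", "has_beak"},
    "bat": {"flies", "has_fur"},
}


def typicality(features: set) -> float:
    """Fraction of the category's characteristic features the item has."""
    return len(features & BIRD_FEATURES) / len(BIRD_FEATURES)


scores = {name: typicality(features) for name, features in ANIMALS.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The score is a degree of typicality, not a membership decision: the bat scores above zero although it is not a bird, and the chicken scores below the robin although both are birds.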

  • Characteristic Features
      • Is a cat on a mat a cat?
      • Is a dead cat a cat?
      • Is a photo of a cat a cat?
      • Is a cat with three legs a cat?
      • Is a cat that barks a cat?
      • Is a cat with a dog's brain a cat?
      • Is a cat with every cell replaced by a dog's cells a cat?

  • Polysemy
      • Most words have more than one sense
          • that dog has floppy ears
          • good ear for jazz
          • three ears of corn
      • Homonymy: same word, different meaning
      • Polysemy: different senses of same word

  • Category Structure and Polysemy
      • Category membership is determined by shared subsets of features
      • Different senses of a word reflect differences in which attributes are shared
      • This is reflected in language by polysemy
          • related meaning, but slightly different
      • Example: bank
          • the building, the institution, the notion of where money is stored

  • Metonymy
      • Use one aspect of something to stand for the whole
      • The building stands for the institution of the bank.
      • Newscast: "The White House released new figures today."
      • Waitperson: "The ham sandwich spilled his drink."

  • Synonymy
      • Different ways of expressing related concepts
      • Examples
          • cat, feline, Siamese cat
      • Overlaps with basic, subordinate level
      • Synonyms are almost never true synonyms
          • used in different contexts
          • have different implications
          • This is a point of contention.

  • Thesauri
      • Polysemy: same word, different senses of meaning
          • slightly different concepts expressed similarly
      • Synonyms: different words, related senses of meaning
          • different ways to express similar concepts
      • Thesauri help draw all these together
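
A toy thesaurus can record both relationships in two small tables. This is a sketch under assumptions: the words come from the earlier slides, but the sense labels, the names `SENSES`, `SYNONYMS`, and `expand`, and the lookup rule are hypothetical.

```python
# Polysemy: one word, several related senses (sense labels are hypothetical).
SENSES = {
    "ear": ["organ of hearing", "sensitivity to sound", "seed head of a cereal"],
    "bank": ["building", "institution", "place where money is stored"],
}

# Synonymy: several words grouped under one concept.
SYNONYMS = {
    "cat": ["cat", "feline", "Siamese cat"],
}


def expand(term):
    """Return every term the thesaurus groups with the given term."""
    for group in SYNONYMS.values():
        if term in group:
            return sorted(group)
    return [term]


print(expand("feline"))
print(SENSES["bank"])
```

Expanding a search term through its synonym group finds more related material, while the sense list shows where a single word may need to be disambiguated.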

  • Summary
      • Processes of categorization underlie many of the issues having to do with information organization
      • Categorization is messier than our computer systems would like
      • Human categories have graded membership, consisting of family resemblances.
          • Family resemblance is expressed in part by which subset of features is shared
          • It is also determined by underlying understandings of the world that do not get represented in most systems