Build A Personal Knowledge Store With Topic Modeling In Contextualize - Episode 267

Summary

Our thought patterns are rarely linear or hierarchical, instead following threads of related topics in unpredictable directions. Topic modeling is an approach to knowledge management which allows for forming a graph of associations to make capturing and organizing your thoughts more natural. In this episode Brett Kromkamp shares his work on the Contextualize project and how you can use it for building your own topic models. He explains why he wrote a new topic modeling engine, how it is architected, and how it compares to other systems for organizing information. Once you are done listening you can take Contextualize for a test run for free with his hosted instance.

Do you want to try out some of the tools and applications that you heard about on Podcast.__init__? Do you have a side project that you want to share with the world? With Linode’s managed Kubernetes platform it’s now even easier to get started with the latest in cloud technologies. With the combined power of the leading container orchestrator and the speed and reliability of Linode’s object storage, node balancers, block storage, and dedicated CPU or GPU instances, you’ve got everything you need to scale up. Go to pythonpodcast.com/linode today and get a $60 credit to launch a new cluster, run a server, upload some data, or… And don’t forget to thank them for being a long time supporter of Podcast.__init__!



Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Brett Kromkamp about Contextualise, a topic modeling application that helps you build a mind map for information-heavy projects

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Contextualize is and some of the types of projects that it can be used for?
    • What was your motivation for creating it?
  • How do you use topic maps in your own work and creative endeavors?
  • The space of personal note-taking and knowledge management is vast and varied. What does Contextualize do well that you have been unable to find or implement in other tools?
  • For someone using Contextualize, what does that workflow look like?
  • How are you approaching integration with different creative contexts (e.g. text editors, graphics editors, word processing, etc.)?
  • Can you describe how Contextualize is implemented?
    • How has the design evolved since you first began working on it?
  • In the documentation for Contextualize it mentions that this is the latest in a string of topic mapping platforms that you have built. What are some of the lessons that you have learned from previous efforts that have influenced the design of this one?
  • One of the challenges with many knowledge management tools is that they are prescriptive in how to work with them. In what ways has your own preference for how to interact with information influenced the direction of Contextualize?
    • Being an open source application, how has its exposure to the public directed your software and user design?
  • How do you approach the challenge of reducing friction in adding content and relations while allowing for flexibility and context management?
  • What are some of the projects that you are using Contextualize for?
  • What are your thoughts on the utility of something like Contextualize for capturing and organizing the collective knowledge of a team of collaborators, whether in a work or casual context?
  • What have you found to be the most interesting, complex, or complicated aspects of building a topic mapping platform?
  • When is Contextualize the wrong choice?
  • What do you have planned for the future of the project?

Keep In Touch

Picks

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Raw Transcript
Tobias Macey
0:00:13
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40 gigabit networking, dedicated CPU and GPU instances, S3-compatible object storage, and worldwide data centers. Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today and get a $60 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. Your host as usual is Tobias Macey, and today I'm interviewing Brett Kromkamp about Contextualize, a topic modeling application that helps you build a mind map for information-heavy projects. So Brett, can you start by introducing yourself?
Brett Kromkamp
0:01:11
Yes. As you mentioned, my name is Brett. I live in Northern Norway and work for a local software company in the educational sector. I've been building topic-map-based systems for over 15 years, so much so that I'm now building them in my private life as well.
Tobias Macey
0:01:32
And do you remember how you first got introduced to Python?
Brett Kromkamp
0:01:35
Yes, I was thinking about that. If I remember correctly, and this must have been 18 to 20 years ago, I bought a CD-ROM, back when you would still buy things like a CD-ROM, and it contained a lot of niche languages. I think it included things like Rexx, a language from IBM, and XLISP, some kind of object-oriented Lisp. One of the languages was also Python, a very, very early version of Python. So I took a look at the different languages on the CD-ROM, and I took a look at Python and thought, yep, this is quite interesting. But I did actually forget about it at that point and moved on to other things. I think I went into Java programming at the time. But at some point, in one of my jobs, I had to give programming lessons to people. Then Python came back to me as this incredibly simple-looking language that would allow people who are learning to program to really focus on learning to program and not on all of the ceremony around programming languages. So I picked up Python and started teaching people how to program with it. And slowly but surely, I personally started using Python more and more from that point onwards.
Tobias Macey
0:02:51
And you mentioned that you've been working on topic maps for quite some time now. I'm wondering if you can give a bit of context about what a topic map is and a bit about how you first got involved in working on them.
Brett Kromkamp
0:03:03
Yeah, sure. Topic maps were part of the whole Semantic Web movement. I think it was round about 2006 or 2007 that semantic web technologies became a real thing, and topic maps were one of those technologies within the semantic web space. As the name implies, topic maps are really about creating and connecting topics. A topic can be anything: any abstract concept can be a topic. So you create your set of topics and then you start relating those topics. Topic Maps is also an ISO standard, but to a degree topic maps were never able, in entrepreneur-speak, to jump the chasm, so they fell somewhat out of favor. Now, if you talk about semantic technologies, you're normally talking about RDF, which is an alternative semantic technology. But topic maps in themselves are such a powerful paradigm and such a powerful way of thinking about so many things, and they allow you to model so many things. I always say topic maps are a metamodel: they allow you to model other models. I started using topic maps in the company I was working for at the time, which was quite a large company in Spain in the tourism industry. We used topic maps to manage the numerous websites we had. We had a lot of products targeting a lot of very specific niche holiday markets, so we built a topic map CMS to manage all of these different micro sites, and topic maps worked absolutely brilliantly in that respect. That's how I came to start using topic maps professionally. Then I came to Norway about eight years ago. In Norway, they've always used topic maps in the educational sector, and they use them at a state level to organize the national curriculum. So that's what we are currently doing.
We build systems that make educational services available at a national level using topic maps. And personally, I just fell in love with the topic maps model, the topic maps paradigm. I think in topic maps, I model in topic maps. So I use them for almost all of my own personal projects.
Tobias Macey
0:05:47
And that brings us to Contextualize, which is your latest incarnation of a platform for being able to actually create, manage, and organize these topic maps. So can you give a bit of a description about what it is, some of the types of projects that you're using it for, and your motivation for creating this iteration of it?
Brett Kromkamp
0:06:04
Okay, so yes, I have built several topic map systems. I built the first one in 2007. The problem with that topic map system, and with topic map and knowledge management applications in general, is the complexity, especially the complexity of the underlying model, which leaks through into the user interface. To a degree that is inevitable, but it makes the application very difficult to use and, by extension, also very niche. People just don't have the time or the interest to learn an extremely complex application. So what I've been trying to do, and with Contextualize I am on that road, is to use the topic maps paradigm but at the same time have the simplest possible user interface. Contextualize is an iteration on that road. So, trying to make it very simple: a user logs into Contextualize and creates a topic map. The creation of a topic map is a very simple thing. You provide a title, a description, maybe an image. That's it. Now you have a topic map. The topic map is pre-populated with some core topics, some critical topics. After that, one of the first things you would do is create a topic, and the creation of a topic is again as simple as possible. You could leave it at just adding a title. You add a title, and the application will automatically generate the topic identifier for you. You could potentially add some topic text, and then you just click on Save. Now you have a topic. And again, a topic can be anything. I'm a topic, my project is a topic, the company I work for is a topic. Any concept you can think of can actually be a topic. Once you've set up a topic, you start thinking about the relationships between your topics.
This is an interesting thing about topic maps, because in many respects, when it comes to graph databases, it's not so much the entities, although they obviously are valuable, but the relationships, the edges between the vertices, where the power of a graph database comes into play. In a topic map, the relationships, called associations in topic map terms, are semantically meaningful, like everything in topic maps. So you're not only asserting a relationship between two topics, for example between me and the company I work for. I'm also asserting the type of relationship, so I can say this is of type employment or work. On top of that, each topic plays a role in a relationship. So I play the role of employee, and my employer, the company I work for, plays the role of employer. All of these nouns that I've just mentioned, employer, employee, work, are topics as well. So you eventually end up with a complete data model that is self-contained and self-referential, and everything you're doing has a semantic meaning. What else could I say about topic maps? Again, topics in themselves have a type. If you create a topic, it is of type topic, and that type is a topic. When you create an association, the association has a type, and by default the type of the association is association. Then there are the roles and the occurrences. Occurrences are what allow you to connect information resources to a topic. Information resources can be absolutely anything, from a link to a page, to a video, to a PDF, to an Excel spreadsheet. Anything you want, you can connect to a topic, and the occurrence is what establishes the connection between a topic and the actual information resource itself. These occurrences are also typed, and again, like everything in topic maps, that type is a topic.
So it is a very powerful way to structure and organize what is potentially a complex domain, or maybe not such a complex domain. But you have this very powerful model at your disposal to do that organizing and structuring of information, and to turn information into something that becomes more like knowledge as opposed to just information.
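The model described in this answer (typed topics, typed associations whose members play roles, typed occurrences, and every type or role being a topic itself) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, method names, and identifiers below are hypothetical and are not the API of Contextualize or its engine.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Association:
    instance_of: str                 # association type; itself a topic identifier
    members: List[Tuple[str, str]]   # (topic identifier, role identifier) pairs


@dataclass
class Occurrence:
    topic: str        # the topic the resource is attached to
    instance_of: str  # occurrence type; itself a topic identifier
    resource: str     # link, file path, text, ...


@dataclass
class TopicMap:
    # Hypothetical in-memory store: topic identifier -> type identifier.
    topics: Dict[str, str] = field(default_factory=dict)
    associations: List[Association] = field(default_factory=list)
    occurrences: List[Occurrence] = field(default_factory=list)

    def __post_init__(self):
        # Pre-populate the core topics; note that "topic" is of type "topic".
        for identifier in ("topic", "association", "occurrence"):
            self.topics[identifier] = "topic"

    def _require(self, *identifiers):
        # Every type, role, and member must already exist as a topic.
        missing = [i for i in identifiers if i not in self.topics]
        if missing:
            raise KeyError(f"unknown topics: {missing}")

    def add_topic(self, identifier, instance_of="topic"):
        self._require(instance_of)
        self.topics[identifier] = instance_of

    def add_association(self, members, instance_of="association"):
        self._require(instance_of, *(ref for pair in members for ref in pair))
        association = Association(instance_of, list(members))
        self.associations.append(association)
        return association

    def add_occurrence(self, topic, resource, instance_of="occurrence"):
        self._require(topic, instance_of)
        occurrence = Occurrence(topic, instance_of, resource)
        self.occurrences.append(occurrence)
        return occurrence


topic_map = TopicMap()
for identifier in ("brett", "company", "employment", "employee", "employer", "website"):
    topic_map.add_topic(identifier)

# "brett" plays the role "employee", "company" plays the role "employer".
employment = topic_map.add_association(
    [("brett", "employee"), ("company", "employer")], instance_of="employment"
)
topic_map.add_occurrence("company", "https://example.com", instance_of="website")
```

Because types and roles are checked against the same topic store, the sketch ends up self-referential in the sense described above: "employment", "employee", and "employer" have to be created as topics before they can be used.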
Tobias Macey
0:10:42
From my reading of the documentation and some of the blog posts that you've created around Contextualize, I understand that one of the ways you're using this platform is for world building, in terms of creating a fictional world and some of the backstory, context, and characters in it. I'm wondering if you can give a bit of background on some of the types of activities that go into that and the ways that a topic map helps you there, and also some of the motivation for building out Contextualize: what were the limitations in other similar platforms that didn't fit your particular use case or mental model for pursuing this creative endeavor?
Brett Kromkamp
0:11:24
Okay, so yes, I do use Contextualize for my own projects. I also use Contextualize to help me organize and structure the Contextualize project itself. For me it's a personal project, and there's a lot going into it: features, ongoing issues. Of course, the actual issues I can manage on something like GitHub, but conceptually, when I'm thinking about Contextualize, I've organized that in a topic map. As you mentioned, another hobby of mine is world building, specifically for games and for books. World building is, I believe, a fantastic example of where these kinds of systems, not necessarily just Contextualize but these kinds of systems in general, are really powerful. When you start focusing on building a world, that is potentially vast. You have countries, you have races, you have characters, you have the economy to think about, you have the geography to think about, and you are trying to keep this all straight. An application like Contextualize suits that kind of project very well, because every character in your world is a topic, and you can establish the relationships between these characters. The quests that they need to go on are topics. The events that take place in these quests are topics, and you can establish temporal or spatial relationships between all of this. So that is world building with a topic-map-based application. But in general, over the last couple of years, there has been a huge growth in applications for personal knowledge management, the majority of which are based on graph data models.
So these kinds of applications are extremely suitable for the quite complex, large sets of information that you need to manage. Another thing with something like topic maps is that even the relationships that you are asserting between two topics, the associations, are topics in themselves. For example, if we go back to the relationship between me as an employee and my company, the employer, there's a relationship there, one of employment. But there's a lot to say about that relationship in itself. There's an employment contract between us, there are dates, there are lots of things that are relevant to that very specific relationship. With topic maps, you can express all of that directly on the relationship. So associations are top-level addressable items, just like everything else. To summarize, the topic map way of thinking and modeling is ideal for making sense of these kinds of large sets of information. And on top of that, once you have created the topics and asserted the relationships between them, you can navigate these topics. You can do that in a very textual way by means of links, or you can do that with graph visualizations, so that you can actually see, from where you currently are in your topic map, what is related one, two, three, four, or five steps away from your current topic. So it's just a very powerful way of dealing with large amounts of information. What was the other question?
Tobias Macey
0:15:28
The other question was just what are some of the things that you found lacking in the other tools that are available for knowledge management and topic mapping that led you down the path of actually creating your own system, and some of the specifics of Contextualize that might lead someone to choose it over the other options?
Brett Kromkamp
0:15:46
Okay, so that is a good question, actually, because the knowledge management space is currently going through quite a lot of changes. There's an application that is becoming very popular, and I understand why. I think the application is called Roam, from a company called Roam Research. One of the big features in this application is what they're calling backlinks. Say you have a document A and you create a link to document B. Currently on the web, links go in one direction. Once you've navigated to the other page, you don't see the link back to where you came from. There is no link backwards. So they've introduced this concept of backlinks, where if I connect document A to document B, then when I'm in document B I can see a summary of all of the links that point back to the current document. This is a very powerful way of managing, organizing, and navigating your information. This kind of feature is quite recent in personal knowledge management systems, but topic maps have had it from day one. Associations have always been two-way. When I create an association, I have to do a couple of things. First of all, I have to say what type of relationship I am establishing. By establishing that type, the relationship becomes semantically meaningful. But just as importantly, I also have to say: I'm going to connect the topic I'm in now, the current topic, to another topic, so what role does this other topic play within the context of this relationship?
So again, if we go back to the employee and employer relationship, I play the role of employee, and the company I'm working for plays the role of employer. I have to define those roles on that relationship. So when I'm in a current topic, I can see everything that is directly related to it. Not only can I see what other topics are related to the current topic, I can also see the type of relationship that has been established, and I can also see the role of that other topic. And again, this gives you so much context. When you are in a specific topic, you have all of that context of what is around this topic. That context is something that I was missing in quite a few knowledge management applications up until quite recently. So that's one of the reasons why I have always liked building topic map applications: they've had this kind of two-way, back-linking thinking in place since day one, basically.
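A small sketch of why an association behaves like a built-in backlink: the association is stored once and can be read from either member's side, together with its type and the other member's role. The tuple layout and the `related` helper are assumptions made for this illustration, not the storage format Contextualize uses.

```python
from collections import namedtuple

# Hypothetical representation: one association, stored once,
# with a mapping of topic identifier -> role played.
Association = namedtuple("Association", ["type", "members"])

associations = [
    Association("employment", {"brett": "employee", "company": "employer"}),
]


def related(topic_id, associations):
    """Return (association type, other topic, other topic's role) triples
    for every association the given topic takes part in."""
    results = []
    for association in associations:
        if topic_id in association.members:
            for other, role in association.members.items():
                if other != topic_id:
                    results.append((association.type, other, role))
    return results


print(related("brett", associations))    # [('employment', 'company', 'employer')]
print(related("company", associations))  # [('employment', 'brett', 'employee')]
```

No second "reverse" link is ever created; navigating from the company back to the employee falls out of the same record.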
Tobias Macey
0:19:01
Yeah. And so digging deeper into Contextualize itself, can you describe a bit about how it's implemented and some of the ways that it has evolved in terms of the system architecture and design since you began working on it?
Brett Kromkamp
0:19:15
Okay. So yes, broadly speaking, Contextualize is made up of two components. The back end, the back-end engine as I call it, is what's called a topic map engine. I've implemented that engine as well, and it's also available as open source on GitHub. That engine is called TopicDB, and it is what really provides all of the magic to Contextualize. When Contextualize starts up, it makes a connection to the back-end topic map engine. From that point onwards, you can start creating the actual topic maps. The engine has an API that allows you to create topics, to create associations, and to create occurrences (occurrences are what connect information resources to topics), and to retrieve them too. So for a specific topic, I can say, get me all of the related topics for this topic, and then I pass in that topic identifier. I can also say, get me the network of topics. That's not just the directly related topics: I can actually span out and get all of the topics up to four or five hops, five associations, away. So this engine provides all of that functionality. The other main part of Contextualize is the actual Flask application itself, which in many respects is a very normal, straightforward Flask application. It provides the web front end that talks to the back-end topic map engine. So again, just these two parts: the topic map engine on the one hand and the actual web application on the other. There's nothing web about the topic map engine in itself. It could be a desktop application talking to it, or it could just be an API with a set of endpoints talking to it. There's nothing web-ish about the actual topic map engine. Flask is what is really putting the topic map engine onto the web.
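The "network of topics" query mentioned here, everything reachable within a given number of associations, is essentially a bounded breadth-first search. The sketch below shows that idea on a plain list of topic pairs. The function name, the `max_hops` parameter, and the sample data are hypothetical and make no claim about how TopicDB implements or names this operation.

```python
from collections import deque


def topic_network(start, edges, max_hops=2):
    """Return {topic identifier: hop count} for every topic reachable
    from `start` within `max_hops` associations (breadth-first search)."""
    # Associations are two-way, so build an undirected adjacency map.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distances[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(current, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    return distances


# Hypothetical sample data: each pair stands for one association.
edges = [
    ("brett", "company"),
    ("company", "norway"),
    ("norway", "europe"),
    ("brett", "contextualise"),
]

directly_related = topic_network("brett", edges, max_hops=1)
two_hops = topic_network("brett", edges, max_hops=2)
```

With `max_hops=1` this corresponds to the "related topics" query; raising the limit widens the neighbourhood, which is what a graph visualization of a topic's surroundings would draw.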
Tobias Macey
0:21:21
And one of the challenges that I've often found in getting into the habit of using a particular application for tracking notes and ideas is its availability across multiple platforms and contexts. For instance, I've used Org mode in Emacs for a little while, and that's been great because it's very flexible and powerful. But as soon as you try to use it on mobile, it all falls apart, because there are not very many good clients for it, and even the ones that are decent in their own right lack a lot of the power of Org mode or are just cumbersome to use. And then there are proprietary systems that are pleasant to use, but then you're locked into their particular platform. One that stands out in recent memory is Notion, because of its flexibility, but if you really want to use it in other contexts, it's difficult. So I'm wondering what your thoughts are on why it's so hard to make a tool accessible in those different contexts, and on providing integration points in Contextualize for being able to work across those different environments.
Brett Kromkamp
0:22:32
So obviously Contextualize is a web application, and I've used a front-end framework, Twitter Bootstrap, which supports the responsive nature of mobile web applications. So using something like Contextualize on, for example, an iPad is totally doable. In that respect, Contextualize is something that you can use on your desktop machine or on a tablet. Where it becomes more difficult is obviously on a smaller screen, although you can use Contextualize on a smaller screen and it works quite well there. I think it's also a different mode. When you've got a smaller screen, you probably are not going to be in the mode of creating a complex taxonomy or ontology, or establishing the relationships between topics. So Contextualize has basically two modes, two ways of working. One is the normal mode, where you're working with your topics, creating the relationships between them, and attaching information to them. The other one is purely a note-taking mode. You switch over to this note-taking mode and you just start writing your notes, but you are not putting them into the context of anything. You are just recording notes. These notes are what I call unattached. They are just floating in space, not connected to anything. But that's the point. For example, you are at a conference and you just want to take a note on something that's been said, or on a person or a certain subject that you're interested in. You just record it, just get it into Contextualize. Then at a later stage, you see the list of notes that you made at that point in time.
And that's when you can do one of two things. You can convert the note into a topic, which you then start relating and connecting to your other topics, or you can attach the note to an already existing topic. So again, these are two different ways of using Contextualize depending on the circumstances. I think on a small device you're not really going to want to use something like Contextualize for complex organizing of your topics. Although you can, I don't think it would be a very good experience. I've also been thinking that at some point maybe there should be an actual native application. Again, like I said, there's nothing webby about the topic map engine in itself. It is just a topic map engine. There is the beginning of a RESTful web API to talk to this topic map engine, and a native application would then talk to that REST API. But I'm somewhat reluctant to go down that road, I must admit, because a lot of time would be spent on building a native application. So I'm trying to ensure that Contextualize is usable across as many screen sizes as possible. And with these different modes, on a small device you are probably going to be doing more note taking, as opposed to really organizing and structuring your knowledge or a specific knowledge domain. At least that's the thinking.
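The note workflow described here (capture unattached notes first, then later either convert a note into a new topic or attach it to an existing one) could look roughly like the following. All names are hypothetical, and the slug-style identifier generation is an assumption for illustration; the transcript only says that the application generates the topic identifier automatically.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Topic:
    identifier: str
    name: str
    notes: List[str] = field(default_factory=list)


def slugify(name: str) -> str:
    # Assumed identifier scheme: lowercase, non-alphanumerics collapsed to "-".
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


class Notebook:
    def __init__(self) -> None:
        self.topics: Dict[str, Topic] = {}
        self.unattached: List[str] = []  # notes "floating in space"

    def take_note(self, text: str) -> None:
        """Quick capture: the note is not connected to anything yet."""
        self.unattached.append(text)

    def convert_to_topic(self, index: int, name: str) -> Topic:
        """Turn an unattached note into a brand-new topic."""
        text = self.unattached.pop(index)
        topic = Topic(slugify(name), name, [text])
        self.topics[topic.identifier] = topic
        return topic

    def attach_to_topic(self, index: int, identifier: str) -> Topic:
        """Attach an unattached note to an already existing topic."""
        topic = self.topics[identifier]  # look up first so a bad id loses no note
        topic.notes.append(self.unattached.pop(index))
        return topic


notebook = Notebook()
notebook.take_note("Talk on topic maps: associations are two-way.")
notebook.take_note("Speaker recommends typing every occurrence.")

talk = notebook.convert_to_topic(0, "Topic Maps Talk")
notebook.attach_to_topic(0, talk.identifier)
```

Keeping capture and organization as separate steps is what lets the quick mode work on a small screen, with the relating of topics deferred to a larger one.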
Tobias Macey
0:26:26
yeah, it's definitely easy to spread yourself too thin on a particular project and then end up just losing steam on it and sidelining it in favor of something else. So I can appreciate your reticence to go down the path of building a dedicated mobile app just for this particular use case when you already have something that suits the context and the device form factor well enough for the time being.
Brett Kromkamp
0:26:47
Yes, really. I mean, with a personal project like Contextualize, there are only so many hours in the day. So you really have to ensure that you are focusing on what, in this case, I consider to be where I'll get the most bang for my buck, where users will actually benefit the most. So, yes, a native mobile application? Who knows, maybe one day.
Tobias Macey
0:27:13
Yes, yeah. And given that Contextualize is a successor to other topic management systems that you've built in the past, what are some of the useful lessons from those previous experiences that you've carried over into Contextualize, either things that you did right or things that needed improvement and that you're using this as the opportunity to get right in this incarnation?
Brett Kromkamp
0:27:41
Yeah, two things specifically. One One was simplify, simplify, simplify as much as possible, and specifically in the actual UX and the UX UI side of things. Again, I mean, there are quite a few really, really powerful knowledge management systems out there, but to be able to use them, you really do need to be an expert. With regards to these kind of applications, and I want this application to be useful to as many people as possible, I built it for myself, but it would still be a very nice thing if other people found it useful. So in order to accomplish that, I really did see that I needed to simplify as much as possible. I mentioned that we started to at least I professionally started using Topic Maps building Topic Maps within the context of a company we built a CMS around Topic Maps. And and I saw the difficulties that people had with topic map systems. Where, where if you were using complex terms, or if you had a GUI or UX that was just a bit too complex, I mean, while eventually the system fails, we had to do quite a lot of refactoring between what we initially built and which we thought was very straightforward. And what we actually ended up with about two to three late years later with the users with something that they finally said, this is good enough for us. There was there was a lot of refactoring that we had to do. We also had to introduce things like tagging. So tagging within a topic map system, you actually what you're doing is you creating, you're creating a relationship between multiple topics. But from the user's point of view, they don't see that they are not aware of that they are just tagging. So you create a tag topic, and you create an association between that tag topic and the topic that's actually been tagged. So under the hood, it's doing all of the complex, it's dealing with all of the complexity, but for the user, it's just tagging. So with those kinds of things, we saw, hey, we can be successful. 
So that lesson is something that I've really tried to always take into account when building later and later versions of topic map systems: simplify, simplify, simplify as much as possible, until you can't. At some point, things are as complex as they are, but still try to simplify as much as possible on the UI/UX side of things. The second lesson is about practicality: being practical as much as possible, because sometimes practicality beats purity. At some point, when I started building Contextualize, I was saying to myself, hey, I know a lot of people have struggled with this concept of hypergraphs, which is when you use one association to establish a relationship between more than two topics. People naturally understand a relationship between two topics: between a mother and her child, between an employee and an employer. Those kinds of relationships people naturally understand. But Topic Maps enable hypergraphs. That is, you can use one association to connect multiple topics together, and a lot of people struggle with this. So I was thinking to myself, okay, I have learned this lesson, this is way too complex, don't include it. And then there's another lesson on top of that: yes, but there are some actual advanced users using Contextualize, and they do find this a very interesting and useful feature. So you make the feature available, but you put it behind an advanced user interface, so people actually have to go out of their way to create hypergraph-based associations. So yes, I want to be quite pure in my implementation, and simplify and keep things as easy as possible. But from a practical point of view, so many people were asking for this specific feature that I said, okay, I will add it.
But I will add it in a way that it doesn't compromise the user experience for less advanced users or users that just don't have that need.
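The two ideas in this answer, tagging stored as an ordinary association and a single association that connects more than two topics, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Member` and `Association` classes, the `tag` helper, and the type and role names are hypothetical and are not taken from the Contextualize code base.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Member:
    """One participant in an association: a topic playing a role."""
    topic_id: str
    role: str


@dataclass
class Association:
    """A typed relationship between two or more topics."""
    type_id: str
    members: List[Member] = field(default_factory=list)


def tag(topic_id: str, tag_id: str) -> Association:
    """What the user sees as 'tagging' is stored as a plain association."""
    return Association(
        type_id="categorization",  # hypothetical type name
        members=[Member(tag_id, "tag"), Member(topic_id, "tagged")],
    )


# Binary case: the user simply "tags" a topic.
tagged = tag("python", "programming-language")

# Hypergraph case: one association connecting more than two topics.
meeting = Association(
    type_id="meeting",  # hypothetical type name
    members=[
        Member("brett", "attendee"),
        Member("tobias", "attendee"),
        Member("podcast-init", "venue"),
    ],
)

print(len(tagged.members))   # 2
print(len(meeting.members))  # 3
```

Because both cases share one data structure, a simple interface can expose only the two-member form while an advanced form allows any number of members.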
Tobias Macey
0:32:13
Yeah, one of the experiences that I've had with different knowledge management systems is that they can be very prescriptive in terms of how you have to interact with them, requiring multiple metadata fields that you have to enter in order to be able to record anything or gain any real utility from the system. So eventually you end up having to become an expert at data entry, and you spend more time on the organizational aspects than on just recording information and then being flexible in how you interact with it. I'm wondering what your particular preferences are in terms of interacting with the information that you're gathering and being able to structure it, and how that has influenced the direction that you've taken with Contextualize.
Brett Kromkamp
0:32:59
Contextualize is quite an opinionated piece of software. In order to get the most out of Contextualize you really do need to have at least a basic understanding of Topic Maps. If you have that understanding, then I believe, based on feedback from my users, that Contextualize is a relatively straightforward application to use. Also, Contextualize relies a lot on defaults. Say you go and create an association. An association is probably one of the more complex entities that you are going to be dealing with within Contextualize; in an association there are at least seven fields. So think of it this way, and always within the context of the current topic: I want to assert a relationship between the current topic and another topic. What do I need to provide? Well, if you just use the defaults, you only have to provide one piece of information, and that is the topic identifier of the other topic. So if I'm in the Brett topic as an employee, and I want to connect to my employer, my company, all I need to do to establish that relationship is provide the ID of that employer's topic. That's it. Then all of the defaults will apply to that relationship. You will get an association of type association, and both topics will play the role of related. So it's a very generic association, but it's good enough: you've established a relationship between two topics. Only when you see that you need to go a bit beyond that, and want to establish a more semantically meaningful relationship between these two topics, do you need to start thinking about overriding the defaults. So potentially you wouldn't want an association of type association; you would want an association of type employment. And no, I don't want Brett to be playing the role of related and the company to be playing the role of related.
I want Brett to play the role of employee and the company to play the role of employer. You can do all of that. And what if that topic doesn't exist? This is something that is really based on feedback from the users. Everything in Contextualize, and in Topic Maps, is always referring to other topics, so those roles and those types are other topics. Before, you would have this very unfortunate UX where somebody would try to create an association between two topics and was straightaway impeded in doing so, because, oh, wait a minute, I don't have a topic for the concept of an employee. I don't have a topic for the concept of employer. I don't have a topic for the concept of employment. So it became this: oh, I have to set this up first, and then create this topic, and then create this topic, and then create this topic, and only then can I create the association. That's a very bad user experience. So what do we have now in Contextualize? Well, first, you can use the defaults. Every single creation form has defaults, so if that's good enough for you, then you're fine. If you need to override them, then inline, in place, in the actual form itself, you can create a topic. So you would type in a topic ID for, for example, employment, and the system detects, hey, this topic doesn't exist. Instead of you having to cancel the association creation and go and create that topic, no, in place you can create the employment topic, and then carry on to the next field, and then carry on to the next field.
So those are definitely changes that have been made to Contextualize based on feedback, truly trying to make it as easy as possible for people to build what is a model of a knowledge domain, to organize and structure the information, but at the same time not make them go too far out of their way to do it. Stopping them from being able to do all of this because they have to do all of this pre-setting-up of topics just to satisfy the application is a very bad user experience. And that's really based on feedback. I've been trying to improve on that aspect a lot.
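A minimal sketch of the behaviour described in this answer: only the other topic's identifier is required, the type defaults to "association" and both roles default to "related", and any type or role topic that does not exist yet is created on the spot rather than blocking the user. The `TopicMap` class, its method names, and the `acme` topic are assumptions made for illustration; the real application's API may look quite different.

```python
from typing import Dict, List, Tuple


class TopicMap:
    """Tiny in-memory stand-in for a topic map store (illustrative only)."""

    def __init__(self) -> None:
        self.topics: Dict[str, str] = {}  # topic id -> display name
        # Each entry: (association type, [(topic id, role), ...])
        self.associations: List[Tuple[str, List[Tuple[str, str]]]] = []

    def ensure_topic(self, topic_id: str) -> None:
        """Create the topic if it is missing, instead of failing."""
        self.topics.setdefault(topic_id, topic_id.replace("-", " ").title())

    def associate(
        self,
        src: str,
        dest: str,
        type_id: str = "association",  # default type named in the interview
        src_role: str = "related",     # default role named in the interview
        dest_role: str = "related",
    ) -> None:
        # Types and roles are topics too, so make sure they all exist.
        for topic_id in (src, dest, type_id, src_role, dest_role):
            self.ensure_topic(topic_id)
        self.associations.append((type_id, [(src, src_role), (dest, dest_role)]))


tm = TopicMap()

# Only the other topic's identifier is supplied; the rest is defaulted.
tm.associate("brett", "acme")

# Overriding the defaults for a semantically richer relationship.
tm.associate("brett", "acme", type_id="employment",
             src_role="employee", dest_role="employer")

print(tm.associations[0])
print(sorted(tm.topics))
```

Creating the missing `employment`, `employee`, and `employer` topics inside `associate` mirrors the in-place creation in the form: the user never has to abandon the association to set up prerequisites.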
Tobias Macey
0:37:32
Given the fact that this is an open source application, and as you mentioned, some of that feedback has directed the user experience, what are some of the other ways that its exposure to the public, and the fact that you're doing this development in the open, have influenced your overall approach to the design and implementation of the software and the user experience?
Brett Kromkamp
0:37:54
So as mentioned, quite a lot of people, and this is fantastic, do just go directly to GitHub and create an issue and ask for a certain feature, or they ask for the priority of a given feature to be increased. That's a very direct advantage that you get from having an open source project. But apart from that, just because you have put something out there and it has attracted a certain amount of people's interest, it sets you somewhat apart, and people will take you more seriously and they will engage with you. Over the last couple of months, a lot of people have started conversations with me that have eventually become email threads, or they've asked me to join a Telegram group. So you are basically making yourself available to a lot of people getting in touch with you. And there are so many like-minded people, people that have so much experience and good ideas and, more than anything, insights. I think that is absolutely the biggest advantage I'm seeing of making something like this available on GitHub as an open source project: just the amount of people that start engaging with you. It's absolutely fantastic, the amount of good input you get from people, and they're not asking anything in return. They are just interested. They have their projects, they have their insights, they have their perspectives, and people are sharing that. There is a huge movement now, something that I wasn't even aware of, and it's called digital gardening. These are people that are working in the open, using all kinds of applications, like Notion, which you mentioned, another application called TiddlyWiki, I think, and Roam Research.
So these are people that are putting their thoughts and their notes directly out in the open, and they call it digital gardening. There's a huge amount of people doing it, and I was completely unaware of this movement, this trend going on, because you would never think to search Google for something like digital gardening when you're talking about personal knowledge management systems. So I just was not aware. Now I'm aware of it, and why am I aware of it? Because people somehow have found my project and started to talk with me. We've taken that conversation further, and it is just a very enriching experience that I am getting out of having my application available as open source.
Tobias Macey
0:40:56
And so one of the other use cases for knowledge management systems is in the context of a team, whether that's a group of people who are collaborating on a creative endeavor, a team of engineers, or just a group of people who work in the same company. I'm wondering what your thoughts are on the utility and potential benefits of Contextualize in that context, being used in a group setting.
Brett Kromkamp
0:41:21
So again, this was something that I added later. For me, all of the systems I've built up until now were really personal knowledge management systems. So it was me, or a specific person, mapping out a specific knowledge domain, and they themselves, and only themselves, were interacting with this documented knowledge domain. Obviously, one of the first things that happened when I made this application available as an open source project was: yes, but I would like to be able to collaborate with my friend on this world building project that we have. And I think that makes a lot of sense as well. So I've added collaboration features to Contextualize, very much modeled on the Google Docs approach. You can comment on a topic map, you can edit a topic map, or you can view a topic map, and you can invite people to your topic map and give them one of those roles. But I'm a bit torn on this, because, how can I say, learning and managing one's own knowledge is such a highly personal thing. What works for one person doesn't necessarily work for another person. So there's a bit of tension here between collective knowledge management and personal knowledge management. There's obviously a lot of room and a lot of value in being able to collaborate on a topic map between two or more people, and that's why I added the feature. But I think something like Contextualize is equally valid, really, as a personal knowledge management application. So yeah, I'm a bit torn. I don't know how much they gain in learning, because a lot of this is about learning. When you're using a knowledge management application, it's about documenting information. It's about documenting a certain learning process, and that is just a very personal thing. What works for me won't work for you, and vice versa.
Nonetheless, it makes sense that people can collaborate on common Topic Maps or a common knowledge domain. So yes, what can I say.
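The view, comment, and edit roles mentioned above can be pictured as a small permission check. The sketch below assumes the roles are ordered, so that an editor can also comment and view; the interview does not say this, and the `Role` enum, the `can` helper, and the example address are all hypothetical.

```python
from enum import IntEnum


class Role(IntEnum):
    """Collaboration roles, ordered so each one includes those below it."""
    VIEW = 1
    COMMENT = 2
    EDIT = 3


def can(granted: Role, needed: Role) -> bool:
    """True if a collaborator holding `granted` may do what needs `needed`."""
    return granted >= needed


# Hypothetical invitation: one collaborator with the comment role.
collaborators = {"friend@example.com": Role.COMMENT}

print(can(collaborators["friend@example.com"], Role.VIEW))  # True
print(can(collaborators["friend@example.com"], Role.EDIT))  # False
```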
Tobias Macey
0:43:42
In terms of the experiences that you've had building Contextualize and some of the other topic modeling platforms that you've worked on, what have you found to be some of the most interesting or complex or complicated aspects of that, and some of the most interesting or unexpected lessons that you've learned in the process?
Brett Kromkamp
0:44:00
Yeah, I must admit, I think for me, more than the actual web application itself, it's the underlying topic map engine that is the most interesting part. That's where the magic happens: graph theory. There's the saying that everything is a graph. Graphs are so powerful. Imagine, for learning systems, that you create an application that not only allows you to map knowledge, but also allows you to determine the best path from one point in that map to another point based on a certain amount of metadata that you record either on the topics themselves or on the associations. There are so many things you can do with graphs, and that beyond doubt is the most interesting part for me. What I think is the most challenging is the front end, not so much the front end as in the JavaScript side of things, but the UX/UI of this whole thing. Because again, the challenge, for me at least, is to make these kinds of applications as usable as possible to as many people as possible, trying to translate what is potentially quite a dry, abstract subject into something that is really useful for a person, so that they have an application that doesn't make it more difficult than it needs to be. That really is a challenge. It's probably what takes up most of my time with Contextualize: trying to implement well thought out GUIs, in the sense that this has to work for a lot of people. Yes, I am putting on the table that they really need to understand Topic Maps before they will get the most out of this application. But once they've gone through that effort, then, Brett, you need to make sure that you stick to your side of the bargain, and that they can actually use this application. So yeah, that's the most challenging thing for me: the GUI, the UX of it all.
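The "best path from one point in that map to another" idea can be illustrated with Dijkstra's shortest-path algorithm over weighted associations. This is a sketch, not the engine's actual method: the interview does not say which algorithm is used, and the `GRAPH` data, the cost values, and the `best_path` function are invented for the example. The cost stands in for whatever metadata is recorded on an association, such as the effort needed to move on to the next topic.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical topic map: topic -> [(neighbouring topic, cost of the association)].
GRAPH: Dict[str, List[Tuple[str, float]]] = {
    "python": [("functions", 1.0), ("classes", 4.0)],
    "functions": [("classes", 1.0), ("decorators", 5.0)],
    "classes": [("decorators", 1.0)],
    "decorators": [],
}


def best_path(graph: Dict[str, List[Tuple[str, float]]],
              start: str, goal: str) -> Optional[List[str]]:
    """Dijkstra's algorithm: cheapest route from start to goal, or None."""
    queue: List[Tuple[float, str, List[str]]] = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, topic, path = heapq.heappop(queue)
        if topic == goal:
            return path
        if topic in settled:
            continue
        settled.add(topic)
        for neighbour, weight in graph.get(topic, []):
            if neighbour not in settled:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None


print(best_path(GRAPH, "python", "decorators"))
# ['python', 'functions', 'classes', 'decorators']
```

The direct links from `python` to `classes` and from `functions` to `decorators` exist but are expensive, so the cheapest route visits every intermediate topic.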
Tobias Macey
0:46:02
So for somebody who does want to get started with using Contextualize, what are their options for being able to actually set it up, get things up and running, and start documenting their knowledge and building these Topic Maps?
Brett Kromkamp
0:46:15
Okay, so first of all, Contextualize is available online. You can go to contextualise.dev, sign up, and start using it. It's free; I'm not going to even start thinking about charging for it. I want people to use it. With it being an open source project, it is also available on GitHub, so you can obviously clone the repository and try to set it up. It's not the most straightforward application to set up, specifically if you're not on a Linux machine, because for one of the dependencies, the one for PostgreSQL, you actually have to have a C compiler and the appropriate header files and other libraries all in place so that you can build the package. The psycopg2 package, I think it is. So specifically for people that are not on Linux, it's quite a difficult thing to set up. Contextualize is also available as a Docker image, and I'm very thankful to the people that have contributed to this, because my knowledge of Docker is quite limited. So you can have a Docker container up and running and use Contextualize just like that. Once you have the application up and running, then it's about how do I go about actually getting some value out of this application? There really are lots of ways. It depends on the audience and on what you want from it. For example, with Contextualize or topic map systems in general, I could set up a topic map where the person or the audience that is going to be interacting with that topic map is doing so not necessarily from the creation side of things, but more from the consumption side of things. They're going to be navigating this map a lot. So Contextualize also has this concept of what I'm calling knowledge paths.
You can create specific associations of type navigation between topics, and Contextualize will render specific navigational UIs to allow you to easily traverse this kind of topic map. So if the topic map is more for navigation, then you will set it up one way. If the topic map is more for you as a person to say, okay, I have this knowledge domain, I want to document it, and I'm going to be reusing and extending this knowledge domain, then you would potentially take another approach. Also, there are at least two different approaches to setting up knowledge domains. One is a more top-down approach, where you are being quite formal in how you set up the topics themselves, the relationships between the topics, and what kind of information you're going to be connecting to the topics. The other is a more bottom-up, iterative, incremental approach: you create the topics as you need them, and you link them up as you need them. And obviously, there's a combination of these as well. For a lot of things, it actually makes sense that you bootstrap a quite formal topic map with a set of topics and a set of predefined associations between those topics, and then from there you start modifying and extending the topic map and creating your own relationships. So it really depends on how you want to use it.
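A knowledge path, as described above, is a chain of associations of type navigation. The sketch below walks such a chain from a starting topic while ignoring associations of other types. The topic names and the `knowledge_path` function are hypothetical, and the sketch assumes each topic has at most one outgoing navigation link, which the interview does not state.

```python
from typing import Dict, List, Tuple

# (association type, source topic, destination topic); all names are made up.
ASSOCIATIONS: List[Tuple[str, str, str]] = [
    ("navigation", "welcome", "installing"),
    ("navigation", "installing", "first-topic-map"),
    ("association", "installing", "docker"),  # generic link, not on the path
    ("navigation", "first-topic-map", "scope"),
]


def knowledge_path(associations: List[Tuple[str, str, str]],
                   start: str) -> List[str]:
    """Follow 'navigation' associations from start until the path ends."""
    next_topic: Dict[str, str] = {
        src: dest for kind, src, dest in associations if kind == "navigation"
    }
    path, seen = [start], {start}
    # Stop at the end of the chain, or if a link would revisit a topic.
    while path[-1] in next_topic and next_topic[path[-1]] not in seen:
        path.append(next_topic[path[-1]])
        seen.add(path[-1])
    return path


print(knowledge_path(ASSOCIATIONS, "welcome"))
# ['welcome', 'installing', 'first-topic-map', 'scope']
```

A top-down setup could ship a list like `ASSOCIATIONS` as the bootstrap, and a bottom-up user would simply append to it as new topics appear.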
Tobias Macey
0:49:45
Yes. And for people who are considering Contextualize, what are the cases where it's the wrong choice, and they might be better off with a simple note taking application or some other means of knowledge management, such as a wiki?
Brett Kromkamp
0:49:57
There is a certain amount of formality to using an application like Contextualize. And again, you should have a basic understanding of Topic Maps. Topic Maps are seemingly quite simple, and conceptually I suppose they are, but there are some darker aspects, some nooks and crannies, with regards to Topic Maps, and if you don't really understand those, some things just won't make sense. Topic Maps, for example, have this concept of scope, and scope you could think of as a synonym for context, or for perspective or point of view. So it's possible in something like Contextualize to create different perspectives on what is basically the same underlying data. But you need to understand that, and you need to be able to set up your topics, and specifically the associations and the resources that you're connecting to those topics, accordingly. You need to understand the concept of scope; otherwise, you won't get this benefit out of something like Contextualize. So again, there's some formality and some prerequisite knowledge that you need to have to get value from something like Contextualize. And it's not just about willingness or lack of willingness: if it's not something that you need, if you are just putting a couple of notes together on a specific topic and your needs don't go beyond that, then don't use something like Contextualize. Absolutely don't. It's overkill. So yeah, probably something along those lines.
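Scope, as described here, amounts to filtering the same underlying data by context. In the sketch below each resource attached to a topic carries a scope, and a query returns only what applies from the chosen point of view. The `Resource` type, the scope names, and the sample texts are assumptions for illustration only.

```python
from typing import List, NamedTuple


class Resource(NamedTuple):
    """A piece of information attached to a topic, valid within one scope."""
    topic_id: str
    scope: str
    text: str


RESOURCES: List[Resource] = [
    Resource("python", "beginner", "A friendly general-purpose language."),
    Resource("python", "expert", "Dynamically typed, with a C reference implementation."),
    Resource("python", "zoology", "A family of non-venomous snakes."),
]


def in_scope(resources: List[Resource], topic_id: str, scope: str) -> List[str]:
    """Return only the resources for a topic that apply in the given scope."""
    return [r.text for r in resources
            if r.topic_id == topic_id and r.scope == scope]


print(in_scope(RESOURCES, "python", "zoology"))
# ['A family of non-venomous snakes.']
```

The same `python` topic yields three different views, which is why scopes left unset or set inconsistently make the results hard to interpret.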
Tobias Macey
0:51:41
And looking forward, as you continue to work on the project and build it out, what are some of the things that you have planned for the future that you're excited about?
Brett Kromkamp
0:51:50
Well, before the podcast we were talking about Python and how Python has grown in so many directions, and the amount of machine learning libraries that are now available for Python. So I could quite easily make it possible in Contextualize to extract entities from a given document. You create a topic, you put a document of sorts in your topic text, and then you could have Contextualize extract entities from that document. So, topic extraction: it could automatically create those topics and then automatically link them to the current topic, the one it used to extract them from. So that is something I've been thinking about: programmatic generation of topics based on things like topic extraction. Also text summarization. If you've got a complex or very long piece of text in a specific topic, you could say, okay, just give me the summary, and you now have the libraries available to give you a summary. Those kinds of things I'm really thinking about. Because using Contextualize now is quite a manual process. You have to create the topics, you have to create the relationships between those topics, you have to upload or attach resources to those topics. I think that's quite a good thing, but it's still quite a manual thing. It would be nice, and I think potentially beyond nice, it would be useful, if there could be a bit more intelligence in Contextualize, like automatic creation of topics and relationships between those topics using machine learning and natural language processing. So yes, that's probably one of the next things I have in mind for 2021.
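The planned flow, extracting candidate topics from a topic's text and linking them back to the source topic, can be sketched without any machine learning at all. The regular expression below is a deliberately naive stand-in for real entity extraction (it just collects runs of capitalised words), and the function names, the sample note, and the use of the generic "association" type are assumptions; nothing here reflects an implemented Contextualize feature.

```python
import re
from typing import List, Tuple


def extract_candidates(text: str) -> List[str]:
    """Naive stand-in for entity extraction: runs of capitalised words.

    A real implementation would presumably use an NLP or machine learning
    library; this regex only illustrates the data flow.
    """
    found = re.findall(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b", text)
    unique: List[str] = []
    for name in found:
        if name not in unique:  # keep first occurrence, preserve order
            unique.append(name)
    return unique


def link_extracted(source_topic: str, text: str) -> List[Tuple[str, str, str]]:
    """Build (type, source topic, extracted topic id) links to the source."""
    return [
        ("association", source_topic, name.lower().replace(" ", "-"))
        for name in extract_candidates(text)
    ]


note = "notes comparing Topic Maps with Roam Research"
print(link_extracted("reading-notes", note))
# [('association', 'reading-notes', 'topic-maps'),
#  ('association', 'reading-notes', 'roam-research')]
```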
Tobias Macey
0:53:44
Are there any other aspects of the work that you're doing on Contextualize, or the overall space of topic modeling and knowledge management, that we didn't discuss that you'd like to cover before we close out the show?
Brett Kromkamp
0:53:53
No, I think we've discussed the majority of things. I think a lot of people could benefit, not necessarily from Contextualize per se, but from these kinds of applications, so I really encourage people to try them. Nowadays so many of us are so-called knowledge workers. We have so much information at our disposal that we potentially have too much information available to us. So having an approach and the tools to be able to manage that and to be on top of it, I really just recommend it to people. Not necessarily Contextualize: use something like Notion, use something like TiddlyWiki, use something like Roam Research, use something like Contextualize. I think a lot of people, more than they would expect, can benefit from these kinds of applications.
Tobias Macey
0:54:49
Well, for anybody who wants to get in touch with you or follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And with that, I'll move us into the picks. This week I'm going to choose the tool Pydantic. I started using it recently on one of my projects, and I've been enjoying it for being able to constrain the space of available options for inputs to different functions. It's certainly a great library for building out specifically typed objects so that you can pass them through your program. And in conjunction with that, I've been using mypy to catch the simple errors that I make, such as, you know, this function doesn't accept this type of thing. So Pydantic and mypy have been great for some of the recent development I've been doing. Definitely recommend checking those out. And with that, I'll pass it to you, Brett. Do you have any picks this week?
Brett Kromkamp
0:55:35
Um, difficult one. But I think
0:55:40
a subject that I think is touching all of us, and I mean globally, is what is happening now all over the world: the killing of George Floyd, the Black Lives Matter movement. This is hugely relevant for all of us. As a white heterosexual guy, I'm very aware that I'm playing the game of life in easy mode. My daughters, just because they will be women, will be playing the same game on a more difficult setting. People of color are playing the game of life on an even more difficult setting. And this is not how things should be. We need to progress in our societies and in our communities, and we need to treat people fairly. So this is something that has to be hugely important for all of us. I just wanted to say that it's happening. We're seeing it happening specifically in the US now, but people are protesting all over the world. Even in the small little town that I'm living in, in the north of Norway, people are very aware of this. They are protesting or making themselves heard, and I think it's a good thing. We have to make sure that this is important for us. We have to make progress on this. People have to be treated fairly. Everyone has to be treated fairly.
Tobias Macey
0:56:50
That's definitely something worth calling out.
Brett Kromkamp
0:56:54
I hope so. I hope so. Sorry, I don't want to be political. I really don't. But I think this is something that really is important for all of us.
Tobias Macey
0:57:04
Absolutely. Thanks. Well, thank you very much for taking the time today to join me and discuss the work that you've been doing on Contextualize and in the broader space of topic modeling. It's definitely an interesting project and an interesting problem domain. So I appreciate all the work that you put in there, and I hope you enjoy the rest of your day.
Brett Kromkamp
0:57:19
Thank you very much, Tobias. Thank you very much.
Tobias Macey
0:57:24
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast, at dataengineeringpodcast.com for the latest on modern data management, and visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts@podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.