Transcript
Many thanks to Rodrigo Girão Serrão for producing this transcription
[ ] reference numbers refer to Show Notes
00:00:01 [Bob Therriault]
Actually, as editor, I just wanted to point out that since this is the transpose episode, all your assumptions about it being in any particular order are completely...
00:00:25 [Conor Hoekstra]
Welcome to another episode of Array Cast.
00:00:28 [CH]
My name is Conor and I'm your host for today and today we have with us 3 slash 4 panelists, maybe one subject matter expert and this will be a continuation of the topic that we discussed in our last episode. But before we do that, we're going to go around and do brief introductions. We'll start with Stephen and then go to Bob, then go to Adám and then Marshall.
00:00:44 [Stephen Taylor]
I'm Stephen Taylor.
00:00:45 [ST]
I'm an APL and q programmer.
00:00:48 [Bob Therriault]
I'm Bob Therriault.
00:00:48 [BT]
I'm a J enthusiast, and I'm working with people on the J wiki.
00:00:51 [Adám Brudzewsky]
And I'm Adám Brudzewsky, full-time APL programmer.
00:00:54 [Marshall Lochbaum]
And I'm Marshall Lochbaum, creator of BQN.
00:00:57 [CH]
And I'm your host Conor, C++ professional developer, but I'm a huge array language enthusiast, learning several languages at once, and I absolutely love having these conversations. So before we hop into today's conversation, I think we have three different announcements. We'll first go to Stephen and then Bob, and then Marshall for those.
00:01:13 [ST]
There's a new page up on GitHub, awesome-q.org; it'll be in the show notes [1]. It directs to a GitHub repo which lists experienced q developers' recommendations for useful code libraries.
00:01:32 [BT]
And on the (ed. reddit/)J/K/APL site [2], Roman Kashitsyn put together an interesting approach to trees; actually a pretty standard approach, but it's a good explanation of how you can represent trees in an array language, specifically J. And as an additional thank you, at the end he mentioned the Array Cast as one of the next things you could look into if you were interested in the language, which I really appreciated, so he gets a mention.
00:01:59 [ML]
And good news for BQN users on Linux. In the past, you've been able to enable a Dyalog keyboard, or an APL keyboard generally, on Linux without installing anything, just from the command line, because it's packaged in a standard tool called xkeyboard-config. And version 2.36 of this config includes BQN as well, so that's just out this past week. I've installed it on Arch (they're probably the fastest at getting it out to users) and verified that the BQN keyboard is there. Over the coming months, other Linux distros are going to be picking it up, so you'll be able to use a BQN keyboard right out of the box on various Linuxes. [3]
00:02:50 [CH]
Awesome. So yeah, notes, or links I should say, will be in the show notes for all of that stuff; be sure to check those out. Our show notes and our transcriptions are all awesome for each of our episodes, so definitely check those out if you're interested. And with those announcements and introductions out of the way, we are going to jump in a second into Part 2, transpose edition, of our rank and leading axis theory conversation. But before we do that, I've got a couple questions, and before that we're going to throw it to Stephen, who's got, I think, just a small update on the news we had about q licensing from Kx last time.
00:03:27 [ST]
Yeah, there was some speculation that the hiatus around licensing of the non-commercial free downloaded q represented a withdrawal by Kx of support for independent developers, and Kx said that's not the case: you can get a download of a non-commercial use, personal study edition of the interpreter. I've tested that since we were last here; I put in an incognito application and got one within a few hours, so it looks all good.
00:04:03 [CH]
Awesome. And I think the email, we'll have it in the notes, but I think it's trial@kx.com.
00:04:07 [ST]
You can just go to the website; you don't need to write an email. [4]
00:04:10 [CH]
Oh awesome, so you can just go straight to the website, no emails required. Fantastic, yeah. So for those that are looking into checking out q, it still is possible; just follow the steps on the website and you should be able to get a copy. Uhm, and the two questions I have: the first one doesn't lead into the rank and leading axis theory conversation; it's sort of a tangent. I posted on the APL Farm discord [5] what I believe was problem number one of LeetCode 269; details don't matter. A bunch of people posted different q solutions, and then Bob posted a J solution, a URL linked to the J playground, [6] and it was looking fantastic. I hadn't checked it in, I don't know, maybe over a month, and it was definitely an upgrade from what I had seen before. So yeah, shout out to the folks working on the J playground; it's coming along nicely. But then, in preparing for this episode, I was trying to find an APL wiki page that listed all the different sort of online REPLs or interpreters for the different languages, [7] and there's one that shows all the ones for APL, and then underneath that there's a link that says go to a list of open source array languages, [8] which I thought would maybe have those interpreters. And so it leads to two questions, sort of 1A and 1B. One: is there an APL wiki page that has a list of not just the APL interpreters but, like, the K ones and the J ones, et cetera? And two (we have to talk about this for just a couple minutes): on this list of open source array languages page, it's got APL dialects, of which there are, you know, double digits, 10 plus. Then it's got a section on K dialects, which it looks like there are six of, and then it says 'other array languages', where there are three entries for BQN, one for I, one for J, and then a couple other languages. And I'm just a little bit conflicted on why J is relegated to 'other array languages' as it is, when, among the APL dialects (I'm not going to call out any of these specifically), definitely I think some of these are more hobby projects that I've seen in little lightning talks at different language conferences. I will just stop talking and let maybe Adám, or whoever, explain what's going on here. Questions 1A, 1B: go.
00:06:47 [ML]
Yeah, well, I don't really think it's the place of the APL wiki to be offering opinions on, you know, which implementations are good and which are not. Obviously it's focused on APL; it's the APL wiki. Uhm, for K, I was probably the one that put in that header, and that's just 'cause it's easier to navigate if you can see all the K dialects in one place. I didn't think of that as giving more prominence to them. But yeah, they do come first, 'cause that's just a categorization that makes more sense. I mean, I guess you could put 'other' before, or you could make K a subheading of 'other', I guess; I didn't think about that. Anybody can change this if they really feel it's going to be a big improvement, but the goal should be just presenting the information that's out there to the reader, as opposed to editorializing, to the greatest extent that that's possible.
00:07:40 [CH]
I guess that's a good point. I just, you know, feel like J is hidden there in a list of others, as is BQN now, you know. We've got to...
00:07:48 [ML]
Yeah, but I mean, these both do have their own pages as well, and they're linked on other pages when it makes sense, so that's not the only place you'd see them.
00:07:58 [AB]
And I'd like to add that I think, yeah, it's maybe questionable whether the Ks should have their own table. Maybe it should just be a link to the K wiki, which has its own parallel page called Running K, instead of Running APL, and it has long lists of web-based interpreters and implementations as well. And maybe, in fact, BQN shouldn't have multiple entries on the APL wiki, and should just link to that page on Marshall's GitHub pages that talks about how to run it.
00:08:32 [ML]
Yeah, well, it's kind of set up to show the different source repositories for things, so it sort of makes sense that there are multiple rows for multiple source repositories. I mean, I've put them all together so that you can see that they're just one line. [9]
00:08:48 [CH]
So it sounds like the solution here is for me to acknowledge that this is a wiki, and I can just go in and attempt to change it as much as I would like, and then have a response from...
00:08:52 [ML]
Pretty much.
00:08:59 [CH]
I don't know who the owners of this page are. I assume, you know, Adám and Marshall have done a ton of the contributions to this site, but they can say what is editorialization and what isn't.
00:09:10 [ML]
If you put something on the wiki, it's just on the wiki. It's there; it's content. [10]
00:09:15 [CH]
Stay tuned, folks. When you go to the show notes to click on this link, you're going to be like, well, what is Conor talking about? It doesn't look anything like you described. And that's because I will have taken my axe to it, and we'll see.
00:09:27 [AB]
Just remember what it says at the bottom when you submit a change: if you do not want your writing to be edited mercilessly and redistributed at will, then do not submit it here.
00:09:39 [CH]
Yeah, it's a good point.
00:09:39 [BT]
A wiki.
00:09:41 [ML]
Yeah, but it is also a good policy not to just, you know, completely revert whatever change someone makes, 'cause obviously they made the change 'cause the current page wasn't working for them. So if you think they've done something wrong, you've got to figure out how to make it better but still make it work for them.
00:09:57 [BT]
And having gotten into a lot of this with the J wiki, these kinds of discussions are exactly what happens in a wiki. And I think the APL wiki is excellent; I don't feel slighted by the fact that J isn't prominent, because it is the APL wiki. But I think this is the kind of thing that you really have to think carefully about. And the nice thing about a wiki is other people can put information into it, and then it can get adapted or edited into something more useful; if somebody has an extreme view, it can be brought back into more useful information fairly easily, and, you know, if it's really extreme, reverted. But it's everybody working together to try and provide a source of information, and I think that's the best way to look at a wiki. And the fact that these languages have wikis, I think, is excellent. It becomes a pool: the quality of the people putting the information in, and of the information going in, makes the tool much better. So with that, Conor, go for it.
00:11:01 [CH]
Well, so what I was going to say is that this is kind of funny: the thing that really prompted this was that J is right below Ivy, [11] because the languages in the 'other array languages' table are sorted alphabetically, and Ivy is a little calculator language that Rob Pike, the creator of Go, wrote at one point, 'cause he's a big APL fan. And it's like, J is massive in terms of what it is compared to Ivy. And then, to compare (they're both hosted on GitHub), I was like, let's go take a look at these star counts here, and sure enough, J only has 497 stars, which is still, you know, not nothing, but Ivy has 1.1k, 1100, so twice as many. So that didn't really confirm what I was hoping it would. So I think the moral of this is everyone should go and star the J source repo on GitHub, and I mean...
00:12:06 [ML]
But I think there are a fair number of people using Ivy as a little calculator tool, like it's intended, who think that's a good system, and it's clearly a thoughtfully designed tool for that. It's got its own syntax and everything, so it seems to work.
00:12:20 [CH]
And it uses keywords, so admittedly it's a lot easier to get started. Anyways, all right, enough. Or do you have one more thing to say, Bob?
00:12:29 [BT]
Well, I was going to say thank you for mentioning the J playground. Joe Boegner and Chris Burke have been doing a lot of work on it, and it is coming along really well; it's a really interesting thing. They've brought labs in, and they're going to be able to do add-ons. Essentially, at about half the speed of a regular J installation, you'll be able to do, I think, everything that you can do in J; you'll be able to do it online in your browser, which is way beyond what anybody thought it would be. They're working through problems right now, but it's making big advances.
00:13:03 [ML]
Yeah, I don't get how you got half speed with WebAssembly, 'cause BQN came down to, like, a third or a quarter of the speed when we compiled CBQN with WebAssembly. So you guys are just lucky.
00:13:14 [BT]
Not the first time we've been lucky.
00:13:19 [ML]
Fair enough.
00:13:21 [CH]
All right, so, that tangent aside: yeah, once again, I think this is the fourth time I'm saying this, links for all this stuff will be in the show notes if you want to check out any of these sites. And with that, we will go to my actual second question, which leads us into part two of, like I said, our rank and leading axis theory conversation. [12] So if you're listening to this as a new listener, definitely pause (I guess we probably should have said this a few minutes ago), pause and go listen to the last episode, because this will be a continuation of the conversation we had there. We did introduce a lot of the topics we're going to expand upon in that first episode, where we introduced sort of the idea of rank, the rank operator, axes, leading axis, trailing axis, leading axis theory, and sort of the differences between the languages. But we'll start this conversation out, and this hopefully is just a quick sort of answer, 'cause I've spent a huge amount of time between now and the last time we recorded, a couple weeks ago, thinking about the two different models of, sort of, BQN versus APL. And I guess J is closer to BQN in a sense, in that they both have the insert reduction operator and adhere closer to leading axis theory, or maybe 'closer' is not even the correct word to use there. One of the big eureka moments, to repeat what I said at the end of that episode, was the different ways to sum rows across languages, and I won't reiterate that. But the key thing that really stood out to me is that the slash in APL, without the bar, is really the outlier, and it's the one that, when you come to APL at least, I would usually rely on most of the time; but the slash-bar, along with the insert from J and BQN, are the ones that are more idiomatic, and go back and listen to Marshall's explanation of the slight difference between the slash-bar and the inserts of J and BQN. [13] But getting to my question: at one point it was talked about how having the insert versions, the leading-axis reduction operators, as sort of the base case is the thing that makes the most sense upon reflecting on it, because it enables you to drill down as much as possible; whereas if you use the slash version, that's starting at, like, negative-one rank (it's starting at the rows of whatever rank-N array you have), and you can only drill down to the rank-0 atoms. And so my question is: how come you can't, like, drill out? Is that not actually possible? Could you not just go the opposite direction and weirdly define rank such that, if you don't specify a rank, it works on rank negative one; if you specify, you know, zero, it goes down; but then you could specify negative one, negative two, negative three, or even drop the negatives and just, you know, whenever it's non-zero, you're going outwards? Is that not a possibility? Like, I haven't really tried to implement it or anything, but that's my question, and we'll go from there. Adám?
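(ed. the row-summing difference referred to here, as a quick Dyalog APL sketch, ⎕IO←1:)
      +/3 4⍴⍳12    ⍝ 10 26 42 (last-axis reduction: row sums)
      +⌿3 4⍴⍳12    ⍝ 15 18 21 24 (first-axis reduction: column sums)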
00:16:47 [AB]
Uhm, the concept is certainly a possibility. I remember suggesting this to Marshall a few years back, because I found that very often you do actually want that last axis, or you do want to deal with the scalars and so on, but sometimes you want to encompass more; you want to do more. I think the problem lies in that it might not be clear, from a definition, what happens when you include more information. So here's an example: if we define multiplication in the scalar case, [14] and then you have two matrices that you're multiplying together, then, because it's defined in terms of scalars, it's clear that you're just pairing up corresponding scalars with each other. When we look at multiplication, it takes any array, but it really goes into each individual number. But if we start at the bottom level instead, and define multiplication just on single scalars, then what happens when you scale up? What happens when you give it more information, when you specify somehow that it should get more? Maybe it will still do it element by element, but maybe now it becomes proper matrix multiplication. [15] So you could make a system like this, but it's almost like a functionality selection then. You would have to define it in a very different way, and I think Marshall used the word 'fragile' about what I described. It doesn't sound very rigorous, and you have to define it multiple times: what happens in this case, what happens in that case. It becomes more of a special case than a generalization.
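(ed. the two candidate meanings, sketched in Dyalog APL: × on matrices pairs up corresponding scalars, while a hypothetical 'scaled-up' multiplication might instead mean the inner product +.×:)
      (2 2⍴1 2 3 4) × 2 2⍴5 6 7 8      ⍝ rows 5 12 and 21 32 (element by element)
      (2 2⍴1 2 3 4) +.× 2 2⍴5 6 7 8    ⍝ rows 19 22 and 43 50 (matrix product)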
00:18:45 [CH]
Do you want to add anything to that Marshall? Is that consistent with what you're thinking in your head right now?
00:18:51 [ML]
Yeah, more or less; it's hard to figure out how to say it. Uhm, I mean, I think, fundamentally, the idea is that the rank operator [16] only goes in one direction. It can only break things down, not build them up. Uh, the old axis operator [17] that Iverson was working with would let you do this, because if you had a sum that only works on a list, then you could say, well, break the array into columns and then sum those. But it turns out that for a lot of operations that don't work just plain linearly, that's not really what we want to do. Uhm, so, I mean, yeah, you'd have to change the whole design. Like, the idea is that the plus slash without a bar already comes with rank one, basically, and so if you add another rank to it, it's still, at the bottom level, splitting your input into lists. So it doesn't matter what you do with the input before that; you're going to be working on lists.
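(ed. a small Dyalog APL illustration of that point: +/ has rank 1 baked in, so applying it at a higher rank changes nothing:)
      +/3 4⍴⍳12        ⍝ 10 26 42
      (+/⍤2)3 4⍴⍳12    ⍝ 10 26 42, the same: the bottom level still sees lists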
00:19:51 [AB]
But you could have an operator.
00:19:54 [ML]
Well so if you want if you want an operator that takes a function that works on lists and applies it to matrices, then it has to pull lists out of the matrix somehow, and so it's never going to get to where you're working on the cells of a matrix.
00:20:10 [AB]
Right exactly.
00:20:10 [ML]
I mean, I guess you could enclose each cell, and then you get a list of cells, and then apply it to that.
00:20:17 [AB]
So the idea is, let's say you have this reduction, right, plus reduction for example, and let's say it goes on the last axis by default. Then you could have an operator that allows you to specify that, if you feed it a matrix, instead of just applying this function on each row of the matrix, it will take the matrix and split it into a list of rows, and then apply the reduction between those.
00:20:47 [ML]
Yeah, so what that would be doing is... rank works from the top down. It pushes down the highest level of the array that you see. So if you're working on a rank 5 array with rank two, it'll make it so that you only ever see the rank 2 parts of that array. And what this would do would be to instead pull the bottom up, so you don't see a smaller rank; what you see is, like, maybe a rank 2 array of rank 3 arrays. So it's pulling the bottom up: the bottom level is a rank 3 array, and those are enclosed.
00:21:26 [CH]
So I guess this sounds way more complicated than what I had in my head. Whereas, like, if we stick with your rank 5 and operating at rank 2: in my mind, if you had, not a slash-bar, so not a leading-axis reduction, [18] if you just had your regular slash, which operates at negative-one rank, then specifying a rank of, you know, whatever number, like negative one on your slash, effectively gets you operating on rank two. So it's just like, instead of starting at rank N and then working your way down, 'cause I guess it goes 0, 1, 2, 3, 4... like, I just feel like you can count down or count up, and you can just convert whatever the rank is. I'm not explaining this well in podcast format. So, if you have your leading-axis-theory reduction and you're operating on a matrix, you can specify your reduction to be rank zero, rank one, rank two. I guess in BQN it will fail on rank 0, potentially, 'cause you said it doesn't allow that. But, like, the default for the insert is to operate...
00:22:37 [ML]
Well, the highest rank that you have.
00:22:38 [CH]
On Rank 2, correct.
00:22:41 [AB]
On rank infinity.
00:22:44 [CH]
Or rank infinity. And then you drill down. But, like, technically you could design a rank operator that just starts at the other end and works its way up, no?
00:22:51 [AB]
No, no, because the problem is, the rank operator is supposed to be general, right? Which means it cannot inspect its operand. It just has to do some transformation and apply it, and that's it, and not worry about what's actually going on. And if you give it a function that applies to the last axis, then it will always apply to the last axis, no matter what you do with it; there's nothing such a rank operator could do to force this function to consider the outer structure. It will always just drill in and do the last axis. But could this be done? Yes, it could. In APL you have got bracket axis; that's exactly that. [19] Right: bracket axis. You can give it plus slash, which works on the last axis, and you give it a number in brackets, which is the dimension you want to work on, and magically it just works. The catch is that bracket axis has to be explicitly defined in the language engine for every single possible combination; there's no general rule you can apply. Because there's really nothing in common between the transformation that takes a last-axis reduction and makes it a previous-axis reduction, and the one that takes a last-axis reversal and makes it into some previous-axis reversal, you can't really formulate a rule there. So bracket axis is just ad hoc for every single operand it can take (or whether you even call them operands is questionable), every single function or operator it can take, and it just has a specific meaning. And the only way that you can generalize this is by letting the user write code that takes that axis into consideration and defines the behavior ad hoc for their function. That is exactly what GNU APL [20] allows you to do. But the beauty of the rank operator is that you don't have to specify what will happen. It's already given.
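(ed. bracket axis in Dyalog APL, ⎕IO←1; each case is defined ad hoc per function:)
      +/[1]3 4⍴⍳12    ⍝ 15 18 21 24, same as +⌿
      +/[2]3 4⍴⍳12    ⍝ 10 26 42, same as +/
      ⌽[1]3 4⍴⍳12     ⍝ same as ⊖, reverse-first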
00:24:59 [CH]
Yeah, I think that makes sense. Well, and maybe we can feel free to skip this next question if it's not going to lend itself to anything understandable for the listener: is it easy enough to describe the implementation of, like, a generic rank operator? What is it actually doing when you say it's applying a transformation? So, if we're sticking with a matrix, if you do, like, rank 0 versus rank 1 versus rank 2?
00:25:24 [ML]
Yeah. So the building block you need (because rank is kind of, well, rank is the simplest thing that even does this) is a function that encloses the k-cells of an array, [21] where you give it the rank k. So if you give it rank one, it'll turn, like, a matrix into a list of lists. If you give it rank zero, it turns it into a matrix of enclosed scalars. So you need that enclose function, and what you do is apply it to both the arguments, all the arguments, with the right rank. So that splits them up, and then you apply your function each on these arrays that are now split up, so that just applies it to every pair of cells. And then you do a merge, or what's the mix, however we want to call it, to combine all those result cells together into one array.
00:26:26 [AB]
Disclose rank zero or disclose each.
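(ed. a sketch of that model in Dyalog APL, monadic case only and ignoring frame and fill details: (f⍤k)Y behaves like mixing f applied to each enclosed k-cell:)
      ↑ +⌿¨ ⊂⍤1 ⊢3 4⍴⍳12    ⍝ 10 26 42, same as (+⌿⍤1)3 4⍴⍳12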
00:26:31 [BT]
And one of the things that I'm interested in is, you keep mentioning all positive ranks, but in J you can do negative ranks as well, which allows you to separate items and, whatever, you can essentially not worry about what the trailing...
00:26:38 [ML]
Yeah, that's well.
00:26:45 [BT]
...part is, and just break it up relative to where you're starting, which is kind of useful at times. [22]
00:26:51 [ML]
Yeah, every implementation I've seen has negative ranks too, [23] but you can handle those as part of the input, right?
00:26:56 [AB]
It's just computation.
00:26:57 [ML]
You can write the code (I've written the code) that takes the ranks and converts them all to the correct positive ranks.
00:27:05 [AB]
So I think I'll try to rephrase the steps you can go through. If we drop, sort of, the terminology and the functions from any of these languages and just think of it as, uhm, let's say we have the first-axis reduce and we want to apply it on the rows of a matrix. So we take the matrix and we chop it up into pieces, into rows, so now we have a bag of rows. And then, on each of the items in our bag, we apply this reduction. Now, the reduction can't apply on the original leading axis, 'cause that axis has been destroyed; it's gone, it's not there, right? We only have the rows. We reduce each one of them, and now we've got a bunch of numbers, and then we take all these numbers that we reduced the matrix to and assemble them back in the order that they came from originally. And since it's a bag of individual numbers, that becomes a vector, or list, or whatever. It's just a matter of splitting it up, applying the function on each, and then stitching them all back together where they came from. That's all the rank operator does.
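(ed. that bag-of-rows picture, concretely, in Dyalog APL:)
      +⌿⍤1 ⊢3 4⍴⍳12    ⍝ 10 26 42: each row reduced, results reassembled into a vector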
00:28:20 [CH]
So, in that example, that one is pretty straightforward, 'cause you're doing a rank-one first-axis reduce, in APL speak, on a matrix, which means, like, if you're coming from a functional language, there's a function called chunk, where basically, because your 2D matrix is stored in row-major order, you just do chunks of whatever the dimension of your rows is, and then you apply each operation. For the rank 2 case, where you're effectively doing a column-wise operation on your matrix, what happens there? Does that mean you're effectively doing a transpose and then a chunk, or are you doing something different? 'Cause, like, transpose will flip it so that you go from column-wise to row-wise and then do the same thing. And this all comes from... this is, like, I don't know what universe it is, but it's some kind of array podcast universe, where, like, on my other podcast, Bryce and I have been talking about multidimensional iterators, [24] and we have a couple more episodes coming out about that. And Marshall, you have to come on now, 'cause I said something that you said, and then Bryce was like, that's not right, and I was like, well, Marshall is probably right; I'm just saying the wrong thing that he didn't say. Anyway, so there's a bunch of crossover if we're going to show up in other people's universes. Anyway, I've just been thinking a lot about this: you know, the default of an insert is to do column-wise, but actually that is striding; like, you're summing elements that are strided from each other, they're not actually contiguous in memory. And anyway, I guess this is the lead-in: I just mentioned transpose without realizing that that's actually, like, the part 2 of this conversation. So, Adám, go ahead, or Marshall, or whoever.
00:30:10 [BT]
Actually, as editor, I just wanted to point out that since this is the transpose episode, all your assumptions about it being in any particular order are completely...
00:30:24 [AB]
And we'll just reorder our inputs here afterwards. Well, you mentioned applying it on rank two. So if we have a leading-axis reduction and you want to apply it at rank 2 on the matrix, conceptually what's happening here is we're chunking our matrix into subarrays of rank 2, of which there's exactly one: the entire matrix. So you could think of it as: we take that entire matrix, stick it into our bag, and then we loop over the bag and we apply this leading-axis reduction to each item in it, of which there's only one, but it doesn't matter, OK? And now that got reduced to a vector, because that's what happens when you reduce a matrix along the first axis. And we take that vector and we stick it right back into the collection it came from. And since there is just one thing, just one vector, that's the result that you get.
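(ed. i.e., in Dyalog APL, where the single rank-2 cell is the whole matrix:)
      (+⌿⍤2)3 4⍴⍳12    ⍝ 15 18 21 24, identical to plain +⌿3 4⍴⍳12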
00:31:17 [CH]
So does that mean, if my understanding is correct, that a rank-2 first-axis reduction on a matrix is less efficient than a plain first-axis reduction on the matrix, even though they're functionally doing the same thing? The rank 2 requires you to do that enclosing, et cetera, et cetera.
00:31:32 [ML]
Yeah, but not by a big margin.
00:31:34 [AB]
Well, you don't actually enclose, but yes: last-axis reduction is efficient because the numbers you're summing are, in this case, contiguous in memory. It doesn't even do striding.
00:31:44 [ML]
Yeah, a decent implementation of the rank operator should see when it's just the same as calling the function. I mean, even if we're going to enclose that whole array and map over it and mix it, that's still basically no work, because, you know, creating...
00:32:01 [AB]
Create a pointer, then follow it, then come back again and remove the pointer. Done.
00:32:04 [ML]
Yeah, that would be a very small cost.
00:32:06 [CH]
I missed, or didn't understand, what Adám said, though, about the first-axis reduction on the matrix. The elements you're summing are contiguous? Like, that's not...
00:32:15 [AB]
No, not first axis; last axis, sorry.
00:32:17 [CH]
Oh, last axis.
00:32:18 [ML]
For the last-axis reduction, the elements are contiguous. On any other axis you have to have a stride, [25] but having a stride is actually better, because that means the operation is more parallel. So if you're adding two rows together, you can take this vector and add it to this vector. If you're summing a single row... I mean, you can parallelize sum because it's commutative and associative, at least on integers, but for other operations that might not be the case, so you actually do have to go one element at a time. And so generally it's a lot better to do reductions along the first axis in terms of performance. Uhm, there you still might have the chance, like if you have many rows, you might actually want to do transposed sections, so that you can then use vector operations and then transpose back. But that gets really complicated very quickly.
00:33:17 [CH]
This has all been very illuminating, at least for me, the host. I'm not sure how the listener is doing, and I apologize if I've just caused massive confusion, but this is great; I just learned a lot and got that clarified. I'm not sure if there's anything more we want to say on this topic. Or, as at some point someone said: now, perfect transition to transpose! Because I have no idea why that's important in the space of leading axis theory.
00:33:42 [BT]
Well, if you wanted to change your leading axis, that's how you do it with transpose.
00:33:49 [CH]
All right episode over.
00:33:54 [AB]
Happy array programming.
00:33:58 [CH]
Go ahead, Marshall.
00:33:59 [ML]
Yeah, yeah. So if you're doing a reduction on the array, that's a case where you only ever work on one axis, period, and the leading axis model lets you pick out that axis, and then you are done; you never need to transpose anything around. Where you need transposes [26] is if you're going to have multiple arrays and you need to align the axes together and shift axes around, or something like that.
00:34:21 [CH]
So you're saying, uh, you're doing something dyadically instead of monadically? Is that another way of phrasing what you just said? OK, so is there, like, a concrete example?
00:34:30 [AB]
Well, even a monadic example can be given where you need transpose first. So I think a nice example would be: let's say we have a two by three by four.
00:34:43 [CH]
Right, I was going to say, like, if we stay in the world of matrices and vectors, is there any use? Do you have to go to the world of cubes?
00:34:51 [ML]
That's for sure. Well, you never, ever need dyadic.
00:34:54 [AB]
Because they're the same. I mean, well, actually, you might use it for the diagonal, but...
00:34:58 [ML]
Well, there's only one transpose that's not a no-op on a matrix, and there are no transposes that aren't a no-op on a list. And we should probably define this: so, in APL, and also, like, if you're working with tensors, a transpose is rearranging the axes of an array in any order; shuffle them all around. And of course, for a matrix there are two axes. So how do you rearrange two things? Well, either you just leave them there, in which case hopefully the implementation figures out: wow, the programmer has asked me to do nothing. I can do that.
00:35:38 [AB]
Real fast, too.
00:35:39 [ML]
Or you swap those axes around. And similarly, you can transpose a list, but there's one axis, and the number of ways to rearrange one axis is severely limited. So that's the idea with dyadic transpose versus monadic transpose.
00:35:55 [CH]
Wait... I mean, I understood everything you said up until that last bit. That's the difference between monadic and dyadic transpose? [27]
00:36:02 [ML]
So dyadic allows you to rearrange the axes in any order at all. Monadic, well, I changed it in BQN; in APL and J it reverses all the axes completely. So you have axes ABCD and it turns them into DCBA.
00:36:21 [CH]
So it performs a reverse on the shape.
00:36:23 [AB]
Yeah, so the transpose of the transpose in APL and J gives you back the original array, always.
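(ed. e.g., in Dyalog APL:)
      ⍴⍉2 3 4⍴⍳24             ⍝ 4 3 2: monadic ⍉ reverses the axis order
      A←2 3 4⍴⍳24 ⋄ A≡⍉⍉A     ⍝ 1: transpose of transpose is the original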
00:36:30 [CH]
Yeah, that makes sense.
00:36:32 [ML]
So, in BQN, what I did is, it instead takes the first axis and moves it to the end, which is simpler, generally, and it gives you more options, 'cause you can rotate the axes by any amount by transposing multiple times. It has a lot of cool properties.
00:36:48 [AB]
Then you can apply transpose inverse to go back again.
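(ed. and the BQN behaviour, in BQN's notation, where monadic ⍉ moves the leading axis to the end and ⍉⁼ undoes it; a sketch:)
      ≢⍉2‿3‿4⥊↕24          # ⟨ 3 4 2 ⟩
      {𝕩≡⍉⁼⍉𝕩} 2‿3‿4⥊↕24   # 1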
00:36:51 [ML]
I think we probably want to talk about, you know, why you would want this completely arbitrary transpose, reordering axes any which way, before we go into why that particular kind of transpose might be useful.
00:37:03 [AB]
Yes. So let me bring this example, and I like using a sample of two by three by four, right, because we don't have unit-length axes and they're all unique, so we can recognize them, and there's also a humanly countable number of elements there. So we've got a 2 by 3 by 4 array and we want to flatten it somewhat, and for that I would use ravel. [28] Well, now, BQN has other ways to do this as well, I think, but let's just use ravel.
00:37:33 [ML]
Well, you can merge 2 axes, but probably you just want to flatten.
00:37:36 [AB]
So, ravel takes an array (it's kind of leading-axis in the sense that it takes an array of any number of dimensions) and it makes it completely flat; it just makes it into a vector. But there are different ways that we could flatten this array partially, right? We could flatten it all together, just ravel it down. [29] But let's say we want to get it down to two dimensions instead, by combining some axes. So we have a 2 by 3 by 4. We could combine the leading two axes: that means we have two layers, and each layer has three rows, and each row has four columns. So in a sense we have two tables, like two matrices, and you could join them so you have a single giant matrix that has all the rows from both matrices together. So, since we have the 2 by 3 by 4 array, it would be 2 times 3, meaning 6 (we get 6 rows altogether) and four columns. And so now we need to combine the two leading axes. If you ravel the array as is, then you get 2 times 3 times 4 elements in a single row; that doesn't work. If you try to apply the rank operator, then you can say, oh, I want ravel to only look at some subarrays. So I can choose rank 0: that would add an axis, definitely not what we want, 'cause it gives the scalars an axis. We could do it at rank one: that ravels all the vectors, which doesn't do anything; it's a no-op. We could do rank 2: that's each of those tables; we had two by three by four, so then we get 2 by, and then the three and four get ravelled out, that's 3 times 4, so we get a 2 by 12 array, which is not what we wanted either. Or we can do it at rank 3, and now we're letting ravel act on the entire array, and we're right back to raveling the entire thing: 2 times 3 times 4 elements in a single vector. So how are we going to do it? We could see that we are able to use the rank operator to restrict the vision of ravel, so it can only see subparts of our array; but the axes that we want ravel to see are the first two axes, and we can only restrict it to see trailing axes. So, using transpose, we can move the axes around. We take the first two axes and make them into the last two axes, which means that the original last axis becomes the first axis, keeping in mind that the last axis is the one we want to preserve, and we still want 4 columns. Now we can apply ravel to the last two axes together; that would be ravel rank 2, and that gives us the two by three merged together to become six elements. Now we have a 4 by 6 array, but we actually wanted our 6 by 4 array, so we use transpose again to swap these two axes around: move the now-trailing length-6 axis up to the front and the now-leading length-4 axis to the end, and we get our 6 by 4 array. So we used transpose twice, with ravel and rank, to do exactly what we wanted.
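(ed. the walkthrough, step by step, in Dyalog APL with ⎕IO←1:)
      a←2 3 4⍴⍳24
      ⍴2 3 1⍉a           ⍝ 4 2 3: the two axes to merge are now trailing
      ⍴(,⍤2)2 3 1⍉a      ⍝ 4 6: each 2-by-3 cell ravelled to 6 elements
      ⍴⍉(,⍤2)2 3 1⍉a     ⍝ 6 4: the final swap back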
00:41:06 [BT]
Well, one of the things I was going to say is, this is where (and I think BQN has it as well) you've got under, [30] so you can do this almost as just a two-step, where you transpose, and then reverse the same transpose coming back out. It's an under, so it's under transpose, and you can...
00:41:22 [AB]
Well, does that work in J?
00:41:26 [ML]
Yeah, 'cause J is going to end up flipping those last two axes too, so you'll get the shape you want, but you'll have the elements in a weird order.
00:41:35 [AB]
So it doesn't work with under in J, not like that. It works in BQN because it's rotating the axes, and...
00:41:42 [ML]
So you want to put the first two axes last; that would be transpose inverse, but you can do under transpose-inverse.
00:41:48 [BT]
Yes. And actually, in J, you don't have to specify all three dimensions to change the... you can...
00:41:48 [ML]
And that is.
00:41:54 [BT]
You can do the... you know, if you just give two dimensions, it just shifts it over.
00:41:55 [ML]
Oh, you mean dyadic transpose.
00:41:58 [AB]
Yeah, yeah, but whether you're using the under operator or not really doesn't matter. The important thing is that we are transposing twice. We're transposing so that the axes that we want to restrict our function to apply to go to the end; then we're using rank to restrict that function; and then we're transposing back again if necessary. It's not always necessary to transpose at the end, but it often is.
00:42:21 [ML]
Bob is going to tell me to stay on topic, but I have to say that J and q also do have the easy way to combine those first axes together. That's what I was talking about with combining specific axes: you can do a join, or in J it's a catenate insert. [31] And that will catenate all the cells, which is the idiomatic way. And in APL you can do comma with axis; some people would consider that idiomatic, and some people wouldn't, so there is a way.
00:42:53 [AB]
Yeah, but that doesn't matter. I'm using ravel here because it's easy to understand how the shapes are changing around. And yeah, I wouldn't do this using ravel, maybe, but I would use it with other functions.
00:43:01 [ML]
Yeah, that makes sense.
00:43:05 [AB]
Where you don't have a built-in way to do it.
00:43:07 [CH]
So, is the way to spell what we've been talking about in APL: transpose, paren, comma rank 2, end paren, transpose?
00:43:16 [AB]
You don't actually need the parens, but the transposing isn't right.
00:43:24 [ML]
The first transpose that happens, which is the one at the right, needs to be a dyadic transpose. [32] That's why we're bringing up the dyadic transpose.
00:43:34 [AB]
So the important thing is, you want to move the axes around, right? You want to move the first two axes to become the last two axes, and the last axis to become the first, and the way you specify that is you tell transpose where you want your axes to move to. And I have to be careful with index origin here, but: you want the first axis to be number 2, and the second axis you want to be number 3, and the third axis you want to be number 1. And then you can ravel rank two (you can put that in parentheses if you want), and then you can transpose it back again. So: transpose, open paren, ravel rank 2, close paren, and then, in index origin one, 2 3 1 transpose, and then your array. Or, if it was index origin 0: 1 2 0 transpose.
00:44:42 [CH]
Yeah, that worked, because I was going to say: before, I was using two monadic transposes and it was giving me a very odd order of things. So the real spelling is: transpose, paren, ravel rank 2, end paren, 2 3 1 transpose, and then your array.
00:45:05 [AB]
And that 2 3 1, a lot of people get that wrong, because, how does it go? They think of it as where they want the axes to come from, but it's actually where the axes need to go. So the 2 3 1 means: the 2, as the first element, means the first axis should become the second axis; the 3, which is in the second position, means the second axis should become the third axis; and the 1, which is in the third position, means the third axis should become the first...
00:45:35 [CH]
...axis. Yeah, it's the classic scatter versus gather. The difference between those two: one, you're sending; the other one, you're bringing.
00:45:47 [AB]
Well, it's the inverse permutation if you want.
00:45:50 [CH]
That's true as well.
00:45:53 [AB]
So if you then do monadic grade on the 2 3 1, that gives you 3 1 2, and that tells you where they came from instead. So that tells you: the first axis comes from position 3, the second axis comes from position one, and the third axis comes from position 2.
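(ed. in Dyalog APL, ⎕IO←1:)
      ⍋2 3 1    ⍝ 3 1 2: grade gives the inverse permutation, the source axis for each result axis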
00:46:14 [ML]
J does it the other way around, though, right? It tells you where each axis comes from.
00:46:21 [AB]
Really? Yeah, wow, I didn't know that. Wow.
00:46:23 [ML]
So that's more intuitive, but the APL version is more general, because it allows you to combine axes, which we wouldn't really call a transpose, but it's another thing that you can do with it. Uhm, if you have, like, even on a matrix, if you have two axes and you say, I want to send these to the same place, so you say 1 1 transpose of a matrix, or 0 0, what it does is actually take the diagonal along that axis. The index in the result turns into two indices in the argument that it's taking from, and so you end up with the diagonal: you get element 1 1, element 2 2, element 3 3, and so on.
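(ed. e.g., in Dyalog APL, ⎕IO←1:)
      1 1⍉3 3⍴⍳9    ⍝ 1 5 9: both argument axes sent to the same result axis gives the diagonal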
00:47:07 [BT]
And the way J does that is slightly different: you box, say, zero and one, and as soon as you box them, it just takes the elements wherever the indices are equal. So box 0 1, on a 2-dimensional matrix, gives you the diagonal. [33]
00:47:22 [AB]
I see. So: select from both of these to this position.
00:47:27 [BT]
Yeah, and it doesn't care about the order of the zero and the one; once you've boxed them, it doesn't make any difference. It's just going to look for the equality and take your diagonal. The interesting thing about that is you can take a larger number of dimensions and specify what diagonals you want to put together. There's an interesting chapter in 50 Shades of J, we can link to it, [34] that really goes into great depth about this.
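(ed. the J spelling being described, as a sketch:)
   (<0 1) |: i. 3 3    NB. 0 4 8: boxed axes select the diagonal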
00:47:55 [AB]
Is that functionality any different in APL?
00:47:58 [ML]
Let's see; I think it should be the same.
00:48:02 [AB]
Yeah, I think you can have as many ones as you want in the left argument to transpose.
00:48:07 [ML]
Yeah, and you can combine: you can say, I want all these axes to turn into result axis one, and all these ones to turn into result axis two. You can do that either way.
00:48:18 [AB]
Yeah, it just means that, because it uses a flat left argument, you can use grade to switch between the two representations when there are no duplicates; whereas with the J version you can't do that, so you would have to use key or something. [35]
00:48:32 [BT]
Yeah, there's a couple... well, yeah, key might be a way to do that.
00:48:36 [ML]
So, in BQN, I did go with the APL way, because it has the generality kind of built in, and it's simple. But the magic of having a one-character inverse is that I pretty much get both, because if you want to do it the J way, at least with an unboxed array, you just write out all your axes and you do transpose inverse. So that's pretty nice.
00:49:01 [CH]
I have to say, I absolutely love the convenience operators, like inverse [37] and cells [36], in BQN. Yeah, there's something about when I see rank and then a literal number or array... kind of.
00:49:19 [ML]
Well, and I guess technically each is a convenience operator too, so those are the three and everyone likes each. [38]
00:49:24 [CH]
It is, yeah. I mean, not in J. [39] That's a good question, because in both BQN and APL it's just a double dot. I've always had the sense that there's a reason it's spelled out like that; so, each, for those that are not familiar, in J is actually the word 'each', and it's a, what do they call it, a...
00:49:47 [CH]
Yeah, the word each, yes: EACH.
00:49:50 [CH]
It's defined in...
00:49:52 [AB]
Meaning the word 'each' is a standard library word or something.
00:49:57 [BT]
Yeah, well, it's because it's a specific way of... you're working with your unboxing, and then you're doing an under onto whatever you're going to do, and then you box it up at the end again. So the reason it's written out as 'each' is because you could do different things with under, but it's very convenient to do boxing in J with under, and that's what each is.
00:50:19 [AB]
It's just under unbox.
00:50:21 [CH]
Ampersand dot greater than; so the greater than is the unbox, the ampersand is composition.
00:50:28 [AB]
Ampersand dot. Ampersand dot.
00:50:30 [CH]
So that's the under.
00:50:30 [AB]
Yep, that's under.
00:50:32 [ML]
Yeah, so the definition is under unbox: ampersand dot greater than.
00:50:39 [AB]
So really what it's doing is
00:50:41 [BT]
It's initially unboxing whatever it's given; it works on that unboxed version and then re-boxes it. And if you've got the colon afterwards, and you're working at infinite rank, it boxes the whole thing; whereas if you've got just the single dot, then what it's going to do is just the reverse of how you came in. That is how you go back.
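(ed. the standard-library definition being described, in J:)
   each =: &.>    NB. 'under unbox': open each box, apply the verb, box the result again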
00:51:01 [CH]
And... well, I was about to say not to skip back, but that's exactly what we'll do here with this question. You said in J and BQN there was a convenient way of spelling the 2 3 1 dyadic transpose; you could just do a something on items, or something?
00:51:07 [BT]
Welcome to transpose.
00:51:19 [AB]
That's only with ravel, right?
00:51:21 [CH]
This is, yeah, going back to the ravel example. What was the way of spelling that?
00:51:27 [ML]
It's, uh, catenate insert is how you'd call it in J, and in BQN it would be join insert. [40] And so what that does is: insert means you're applying your function across all the cells, and catenate just combines them along the first axis. So overall, you get to combine all the cells along the first axis. And this is pretty much the same as K's; flatten, I think, is what they call the one that does that.
00:51:57 [AB]
It's just it's concatenate reduction, right? OK.
00:52:00 [ML]
Yeah, so it's comma slash in K. I don't remember the names for the primitives, but you're joining lists. And K just naturally does things along the first axis, 'cause that's the outermost layer.
00:52:16 [AB]
And then, in APL, the traditional way of doing it, using that bracket axis again, is comma, bracket axis, iota two, close bracket. [41] And again, it's ad hoc, right? You just cannot reason about this, so don't even try. There's a definition that if you use the square brackets on the right side of monadic comma, with some numbers for axes inside the square brackets, then what it does is it combines those axes together, ravels those together. But I'm actually in the middle of proposing at Dyalog that we add a monadic function, I like to call it 'demote', which takes the two leading axes and combines them. And then, if you want to combine more axes, you could apply it multiple times; and if you want to combine internal axes, you could use the rank operator to do so. But of course, if you want to combine disparate axes, you'd have to transpose first.
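(ed. i.e., in Dyalog APL, ⎕IO←1:)
      ⍴,[⍳2]2 3 4⍴⍳24    ⍝ 6 4: ravel-with-axis merges the first two axes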
00:53:25 [ML]
Yeah, well, that's exactly what the comma with axis in Dyalog actually does. You can combine any axes, and they might be spread out across the array, but what it does under the hood is it just transposes them all together and then does a mix, or some sort of split, on the result of that.
00:53:48 [CH]
Yeah, and this is where the difference between the reduce-first in APL and the inserts in J and BQN shows up. Because, for the listener: to aid my understanding of this very simple topic (if it is not clear to the listener, I am being extremely sarcastic), I have been typing along in RIDE, J, and BQN. [42] And for the convenience ones, yeah: catenate insert, which, like, looks like one of those little styrofoam thingies that comes in a package, a sideways S; catenate insert, and then 2, I don't know what you call that, sort of underbar, 3, underbar, 4, shape, range 24. That's pretty nice, because basically it's just catenate insert and then your array. You switch over to J, and you can just do comma slash i dot 2 3 4, 'cause iota in J has the nice feature where you can reshape in the iota. [43] I don't think BQN has that, right?
00:55:01 [ML]
Correct, yeah, I really miss it when I'm writing code examples, but I don't think I miss it any other time.
00:55:07 [BT]
Yeah, it's great for code examples.
00:55:09 [CH]
And then APL, if you go over... you know, previously we had the spelled-out transpose, paren, comma rank two, paren, 2 3 1 transpose, and then your array, which, compared to the two examples we just spelled out in J and BQN, is pretty awful. If you switch to the ravel, comma slash-bar, so a first-axis reduce, and then your array, 2 3 4 shape iota 24, you don't end up with your 6 by 4 matrix of iota 24. You end up with a 3 by 4 matrix of enclosed lists of length two, starting off with 1 13, 2 14, going across. And I think this is what was explained in the last episode: the difference between, was it, this is the APL, the Sharp APL way of doing things, [44] and J's and BQN's is consistent with APL2.
00:56:07 [AB]
No, the opposite.
00:56:09 [ML]
Yeah, J and BQN are following Sharp here. J is closely related to Sharp APL. [44]
00:56:09 [CH]
Or the opposite?
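(ed. the three spellings being compared, reconstructed from the spoken descriptions; a sketch:)
      ∾˝ 2‿3‿4⥊↕24       # BQN join-insert: a 6-by-4 result
      ,/ i. 2 3 4         NB. J catenate-insert: shape 6 4
      ,⌿2 3 4⍴⍳24         ⍝ APL first-axis reduce: a 3-by-4 matrix of nested pairs, 1 13, 2 14, and so on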
00:56:15 [AB]
Conor, if you take the conversion, for example, where you have that styrofoam thingy and the insert, then the double quote: if you stick an each in between the two, then you would get exactly what you would get in APL. And the same thing goes for J, by the way, if you do this with an each...
00:56:34 [ML]
well J would start boxing.
00:56:37 [AB]
No, it's just comma each reduce, ,each/ i. 2 3 4, and in J that gives you the same thing.
00:56:45 [ML]
Yeah it does, but it's like.
00:56:46 [BT]
Yeah, but that's kind of.
00:56:47 [ML]
It's a lot different going from a simple array to a boxed array like there's this extra level where BQN's [inaudible]
00:56:56 [AB]
But basically, applying each gives you APL's reduction.
00:56:57 [ML]
It's still.
00:56:59 [AB]
Yeah, you cannot get around it; you have to define it differently.
00:57:05 [ML]
Yeah, but they didn't have any boxes in the array before, so it's actually adding boxes there.
00:57:12 [BT]
Yeah, it's changing the type of what you end up with coming back out, which is a bit more than... I don't think in APL you're changing your type at all. Or, Adám, are you, at that point?
00:57:23 [AB]
I don't understand what that means.
00:57:25 [BT]
Well, I'm going from, say, for instance, integers going in, i dot 2 3 4; I'm going to end up with an array of boxes now, because each will box what my result is. Can't get around that.
00:57:42 [AB]
Yeah, but if I do the concatenate reduction in APL, you also get enclosures.
00:57:48 [ML]
Yeah, so I mean, I guess it probably is not actually very interesting in this context. But, like, if you do minus each on a numeric array in J, that's not the same as just minus; it adds boxes to everything, so there's this extra level. Although APL and BQN are adding the extra level of array depth anyway; they're just doing it in the catenate rather than in the each. [45]
00:58:17 [AB]
And then, yeah, and then in APL you can...
00:58:20 [ML]
Yeah, so that's really subtle.
00:58:22 [AB]
It doesn't really matter for what we're doing here, uhm, because this is only because we're concatenating things. If we were using any other function, that wouldn't really be relevant. And then, yeah, so the traditional way of doing it in APL is comma, bracket, iota two, close bracket.
00:58:38 [BT]
So I've got a question, probably because I don't understand the implementation of transpose as well as I probably should. When I actually do a transpose, am I copying the array into a different space? So I would be changing the striding and everything as well?
00:58:54 [AB]
The strides.
00:58:54 [BT]
Well, the way the... sorry, 'strides', that's the word you said.
00:58:55 [ML]
How it's stored, yeah?
00:58:58 [BT]
Yeah yeah, sorry.
00:58:59 [ML]
Yeah, APL and J and BQN all actually perform the data movement. APL can sometimes do this in place; I don't know if J does it. It's not a hugely valuable optimization, and you can only easily do a transpose in place if you're transposing axes that have the same length. Sometimes it happens in place in Dyalog APL, but it doesn't really matter: you are actually moving all the data, creating a new array that has this new ordering. As opposed to, I think, NumPy, which might actually do it [46], like, virtually, where it basically says, well, now axis one is axis 2, and so on, 'cause its array representation is complicated enough to have strides that don't go strictly in leading-axis order.
00:59:55 [BT]
And the reason I ask is because, in that last ADSP episode that I listened to, that's kind of what Bryce was talking about: you know, sometimes you need to actually change the order with a transpose to be able to make full use of parallelization, because you're actually changing the memory order, yeah?
01:00:20 [ML]
And that definitely can be the case. I mean, for most primitives the actual primitive is going to be faster than transposing the array, but if you're going to do a lot of operations together, it definitely can be faster to transpose the array first and get a more favorable ordering. Usually that's using plain leading-axis operations instead of working with rank.
01:00:45 [BT]
And that's what I've always been told: your choice of what you put on the different axes is really important going in, because you can make your implementation much quicker if you choose them appropriately and don't work against them through every operation that you do. And if you do happen to choose inappropriately for one section of your program, sometimes the best way is to do a transpose, do all the stuff that you want to do, and then transpose back, because it'll still be quicker than fighting your way through something where you're working against the leading axis the whole way.
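(ed. A minimal BQN sketch of this transpose, work, transpose back pattern, which is what Under gives you; the array and the operation here are illustrative, not from the episode:
    a ← 2‿3‿4⥊↕24     # example rank-3 array
    (⌽⌾⍉ a) ≡ ⌽⎉2 a   # 1: reverse along axis 1, done by transposing, reversing the new leading axis, and transposing back
)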
01:01:23 [ML]
Yeah, that definitely can be the case. And there are some, the Fourier transform is one in particular, where often you want to transpose the data as you're working with it. The Fourier transform [47] does something along every axis: the basic implementation requires your input to have a length that's a power of two, and then you split it into a two by two by two by two and so on array, and you want to do something along every axis of that. So there are two ways to do it: either actually apply the function with a bunch of different ranks, or apply the function and then transpose one axis to the end and work along all the axes that way. It kind of depends on what your language has support for doing quickly, but the transpose can often be a good way to do it. And BQN's transpose, which just moves one axis to the end, is a lot better for that, because after you do it enough times you've rotated the whole array around.
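(ed. A BQN sketch of the loop Marshall describes; F is just a stand-in per-axis step (a reverse), not a real FFT pass:
    a ← 2‿2‿2‿2⥊↕16   # power-of-two data split into a 2 by 2 by 2 by 2 array
    F ← ⌽              # placeholder operation acting on the leading axis
    ((⍉∘F)⍟4 a) ≡ ⌽⎉1 ⌽⎉2 ⌽⎉3 ⌽ a   # 1: after rank-many rounds of apply-then-move-first-axis-to-end, F has hit every axis and the axes are back in their original order
)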
01:02:33 [CH]
I've been trying to code something to confirm my understanding of how this stuff all works: a one rotate on the find of the shape of your matrix, and then doing a dyadic transpose. I think it should work, I just haven't gotten it to work; at least, that's how it works in BQN. So the key here is that dyadic transpose takes the leading axis and moves it to the end, and APL and J reverse it, correct? [48]
01:03:14 [AB]
Monadic transpose.
01:03:16 [CH]
For dyadic transpose.
01:03:20 [AB]
Yeah so sorry yes.
01:03:21 [CH]
I'm trying to model the monadic transpose with the dyadic transpose.
01:03:26 [ML]
So for your find-of-the-shape thing, I think you would probably want to take the rank, which in BQN is equals, and then do range of that, or iota of that. Because if you do classify of the shape, then if you have two axes that have the same length, that's going to give them the same number, and then everything will break.
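(ed. A working version of what Conor is reaching for, as a BQN sketch; note the rotation has to go the other way round for BQN's monadic transpose:
    a ← 2‿3‿4⥊↕24
    ((¯1⌽↕=a) ⍉ a) ≡ ⍉a     # 1: left argument 2‿0‿1 moves the first axis to the end, same as monadic ⍉
    ((¯1⊸⌽↕∘=)⊸⍉ a) ≡ ⍉a    # 1: the same, with the left argument computed tacitly
    ≢ (1⌽↕=a) ⍉ a            # 4‿2‿3: the other rotation moves the last axis to the front instead
)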
01:03:45 [CH]
Right, right, you can just do a range of the rank, then one rotate that, and then do a dyadic transpose, instead of unnecessarily doing what I was trying to do. Alright, so I guess the closing question is... I mean, this is the most my brain has hurt thinking about APL in a long time. I've always heard about dyadic transpose as kind of like a joke in the array community. Presumably there's utility in dyadic transpose, even in transpose on higher-rank arrays. I mean, matrices I think most people can visualize: you're just taking the two corners and flipping them like a bed sheet. So, is this good? Double thumbs up to all this stuff, I guess?
01:05:01 [AB]
Well, it's not just good, I mean, it's pretty much necessary. It gets really complicated if you have to write the corresponding code without dyadic transpose, or without transpose at all.
01:05:10 [ML]
Well, it does depend on how many axes you have. If you don't work with data that has a really high rank, then you may be fine without transposing at all. You're almost definitely fine with just the monadic transpose.
01:05:26 [AB]
Well, if "really high" means three or more, right?
01:05:31 [ML]
Yeah, but even if you work with occasional arrays of rank three, you might actually be doing a lot of your processing at rank two and things like that, so you won't necessarily need to transpose those.
01:05:45 [CH]
How often is it that you're explicitly writing out the left argument to a dyadic transpose, as opposed to what I was trying to do here in BQN, which is using the after (ed. before, ⊸) operator to define your left argument as a preprocessing function, you know, the range of the rank with a one rotate? So then you don't actually need to explicitly spell out that left argument; it's just some derivative of the current matrix or higher-rank array. How is this classically used? Is it people spelling out the left argument?
01:06:35 [AB]
Not sure I understand this question.
01:06:37 [CH]
Well, in our example we spelled out 2 3 1 as the left argument to the dyadic transpose, but what I was attempting to do earlier was: you could technically not spell that out, and just do an iota on the rank, so the shape of your shape, and then do a one transpose in APL to get that. Or, sorry, a one rotate to get that.
01:07:04 [AB]
Oh, so you're asking how often we spell out the left argument rather than computing it on the spot? I would always spell it out, unless I'm trying to write some kind of utility where I don't know in advance what I'm doing.
01:07:23 [ML]
Yeah, I think it's pretty rare to have a variable-rank argument where you really genuinely don't know anything about what you want the axes to be.
01:07:34 [BT]
The reason you would transpose is because you know the shape and it's not fitting what you want to do with it. So you would start with knowing the shape well.
01:07:41 [ML]
Like in the Fourier transform example, you know the shape is all twos, but you don't know how many of them there are. BQN lets you do that one-axis transpose nicely, but I think in APL and J your best option is to compute the left argument for it. I don't know if there's a better way.
01:08:04 [AB]
Yeah, I'm just looking at GitHub now; it has this new code search thing that you can be invited into. I searched for the regex of a digit followed by transpose, to see what I could find. It has 2 3 1 transpose; then it has some diagonals, more 2 3 1, 2 2 1 3 transpose, and 1 3 2 transpose. And in almost all of the cases where it has an actual left argument that's not just for the diagonal, it's followed by raveling the leading two axes. But that was only in Dyalog's repositories, which aren't really representative of production code, I guess. If you search across all repositories, then we can find Jay Foad in some Advent of Code using a 2 3 4 1 transpose. So that's an example where it would benefit.
01:09:15 [ML]
That's still just transposing, because then.
01:09:17 [AB]
Exactly, that's what I was about to say: that is an example where you actually want BQN's transpose. I found one here that says 1 0 2 transpose, so that's flipping the first two axes.
01:09:31 [ML]
So BQN lets you drop the trailing elements from your left argument, so you can write that as just 1 transpose, yeah.
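(ed. In BQN, with an illustrative array a:
    a ← 2‿3‿4⥊↕24
    (1‿0‿2 ⍉ a) ≡ 1⍉a   # 1: trailing axes that stay put can be dropped, so this is just "swap the first two axes"
)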
01:09:39 [BT]
And in J, with dyadic transpose, if you just put a zero in front of it, it's going to move that zero axis to the end, if you want to do it that way.
01:09:52 [ML]
Oh yeah, so that does the BQN thing. But that doesn't do this one that we...
01:09:57 [AB]
I just found an interesting example here: 1 2 3 4 4 transpose. That's just combining the last two axes into a diagonal. That could have been written as...
01:10:06 [ML]
Yeah, so that's 0 0 or 1 1 transpose with rank one.
01:10:10 [AB]
Rank 2.
01:10:12 [ML]
So it's 1 1, yeah, it's 1 1 jot transpose rank 2.
01:10:16 [AB]
But then there's a real example: 0 1 2 3 5 4. Look at that. That's just transposing the last two axes, right?
01:10:25 [ML]
Which is transpose rank two, yeah.
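(ed. In BQN terms; note BQN indexes axes from zero, so Marshall's "1 1" is the one-indexed APL spelling:
    a ← 2‿3‿4‿5⥊↕120   # illustrative rank-4 array
    ≢ ⍉⎉2 a             # 2‿3‿5‿4: swap the last two axes, the 0 1 2 3 5 4 example
    ≢ 0‿0⊸⍉⎉2 a         # 2‿3‿4: a repeated axis in the left argument takes a diagonal, here over the last two axes
)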
01:10:28 [CH]
This episode, as you can tell, I am now effectively the listener; I'm no longer the host. And all I'm hearing is: well, we got a couple of examples here. We got 1 0 2 transpose, that's actually in BQN. And then we got... that's just a 1, you can drop it down now. We got a 0 1 2 3 4 4 dyadic transpose, so now that, that's a...
01:10:46 [ST]
It's perfect, isn't it?
01:10:47 [AB]
Just what this listener wants to hear.
01:10:47 [CH]
Good example we got there. That's a classic, classic.
01:10:49 [AB]
Isn't it? I found one.
01:10:52 [CH]
Look at this one. This one, we got a 2 3 1 4 4, that's actually classic, I remember that one from 2017. And I'm just sitting here being like, what has happened to my favorite language? We used to work linearly, you know: you do one operation, then another one. And now I just feel like I'm falling, like, you know, Doctor Strange when he goes into the glass thing and he's falling and things are breaking. We're in the mirror dimension, and Adám and Marshall are just going back and forth speaking this different language: oh yeah, well no, not yet; and J, yeah, J doesn't have that; oh look, it's great. I'm just like, you guys are literally speaking a different language now.
01:11:29 [AB]
No, I mean, you can go do this search yourself; we'll put a link to the search [49] and then you can see that there are actually some uses of it, but it's pretty rare.
01:11:37 [ML]
It's true, but the point is that when you write out all these numbers, how do you understand it? You have to read all these numbers and figure out what they're saying. If you say transpose rank 2, you know what that does. Right? Well, maybe; sure, Adám knows what that does.
01:11:52 [AB]
I mean, if you have a collection of matrices and you're transposing all those matrices, that's pretty straightforward, I would say. But there are some... I mean, there's something here, a Sudoku solver it seems, 1 3 2 4 transpose, so that's interleaving the axes. That's a real hardcore usage of transpose, where you're...
01:12:14 [ML]
Well, it swaps the two middle axes. But I don't have an easy way to write that. I mean, I guess
01:12:21
No you could
01:12:22 [ML]
use one transpose cells in BQN so
01:12:23 [AB]
you do: 1 3 2 4 transpose
01:12:25 [ML]
there we go.
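(ed. In BQN, for a rank-4 array:
    a ← 2‿3‿4‿5⥊↕120
    ≢ 1⊸⍉˘ a   # 2‿4‿3‿5: "1 transpose" on each major cell swaps the two middle axes, the 1 3 2 4 example above
)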
01:12:28 [AB]
But to answer your question, Conor: it is pretty rare, right? It is master-class type stuff, and it's maybe even indicative of you having structured your data wrong to begin with. Ideally you want to avoid this kind of transposing.
01:12:46 [CH]
And I feel like you just casually mentioned there that you can combine rank with transpose, which we haven't even really talked about. We've talked about transposing before and after, AKA doing rank operations like under transpose, but we never really mentioned the fact that you can modify transpose with rank, which I guess for the example that you pointed out wouldn't work, but for certain cases it could.
01:13:17 [BT]
Well, if you think about it, say you have 2 3 4, right? And you apply rank to transpose. In that case, you're just going to transpose the two tables. They're not going to affect each other; they're just going to be transposed themselves. That's all that's doing there.
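(ed. The same idea in BQN:
    a ← 2‿3‿4⥊↕24
    ≢ ⍉⎉2 a   # 2‿4‿3: each 3 by 4 table is transposed on its own; the tables don't affect each other
)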
01:13:35 [ST]
Well, this is maybe too big an intervention to handle in this episode, but I did write a paper some years ago about an application that was working with rank-8 arrays, using dyadic transpose as an alternative to what in another implementation would be non-trivial, or a lot of, SQL queries. [50] A rank-8 array could be represented as a flat table with a lot of index columns, and in APL you can go and say: I want this index on that axis, this index on that axis, and this index on another axis. It's kind of equivalent to doing a select, where these index columns indicate where you are. But the challenge was to keep the application logic readable with all these indexes and axes. I'll just kind of bookmark it here and leave the URL for the paper in the show notes.
01:14:50 [CH]
Yes, it sounds like there definitely is some utility in this stuff, for sure. It's just literally mind-bending right now.
01:14:58 [AB]
Brain bending.
01:14:59 [ML]
Which is worse?
01:15:00 [CH]
Yeah, I'll say the same thing as I said at the end of the last episode: I've made it a solid two and a half years not only without understanding dyadic transpose, but even without transpose on anything higher than a matrix, a rank-2 array. Although, mind you, I'm not writing production code six or eight hours a day in APL. Now, having this, you know, 20% in my tool belt, because let's be honest, I'm going to have to listen to this episode a couple of times and then play with it a lot more to really fully understand it, I can't really think of a time that, for simple problems, you really need to reach for this. It's probably...
01:15:44 [BT]
Well, your last ADSP, though, that's exactly what Bryce was talking about: if you do a transpose, you can break the things you're working on up in different ways and be more effective when you try to go parallel.
01:15:59 [ML]
Yeah, but generally you would just do one monadic-type transpose, where you're swapping two axes. As for dyadic transpose: if you're really working on those rank-8 arrays and your data is eight-dimensional, then you're probably going to need some pretty complicated transposes at some point. But most applications work with rank one or two arrays, and even just working with rank one is often fine, so in a lot of cases you don't need anything complicated.
01:16:35 [CH]
Yeah, my mental model is that most of the things are just sort of data-flow operations: one input, one output. There are certain ones, like outer product and reshape, that can explode or add some dimensions, and reductions reduce dimensions. But this is a whole new thing: a dyadic transpose with, you know, a three-element or four-element array, that's doing something completely... If you're going along visualizing, oh, what's my intermediate state, and then you hit that one, it's just like, whoo, you flipped upside down, or now you've got your head underwater. Yeah, definitely it's, what did someone call it, a master class? The glyph.
01:17:17 [BT]
I think once again we should congratulate the listener.
01:17:26 [ST]
The one who's still with us.
01:17:29 [AB]
Yeah, congratulations on making it this far. Don't worry if it's a bit much, because lots of people have been successful APL programmers and used APL or other array languages a lot, and haven't really done a whole lot of dyadic transposing.
01:17:44 [CH]
Alright, with that I guess we will, first, once again thank you, Marshall, for coming on; as always, these conversations are great. And with that we will say happy array programming.
01:17:57 [All]
Happy array programming.