Transcript

Transcript prepared by Bob Therriault, Adám Brudzewsky, Sanjay Cherian and Igor Kim.
[ ] reference numbers refer to Show Notes

00:00:00 [Henry Rich]

The big advantage of this to me is it's just so easy to create a multi-task application. If you have an array and you want to spawn one thread per item, like you'd like to apply the rank operator but use a different thread for each item, then just create a task with a rank of whatever the item rank is, and it'll automatically run in multiple threads. You don't have to do anything more than that.

00:00:23 [Music Theme]

00:00:33 [Conor Hoekstra]

Welcome to another episode of ArrayCast. I'm your host Conor, and today with us we have a repeat guest, I think setting the record with a third appearance on ArrayCast. But before we introduce him, we will go around and do brief introductions from our three panelists today. We'll start with Bob, then go to Adám, and then go to Marshall.

00:00:51 [Bob Therriault]

I'm Bob Therriault and appropriately, today I am a J enthusiast.

00:00:54 [Adám Brudzewsky]

I'm Adám Brudzewsky. I'm definitely intrigued by J, but I do APL.

00:01:00 [Marshall Lochbaum]

I'm Marshall Lochbaum. I started out with J. I worked for Dyalog for a while, and now I develop BQN.

00:01:06 [CH]

And as mentioned before, my name is Conor. I'm a research scientist at NVIDIA and a programming language polyglot, but I have a huge passion and enthusiasm for all array languages. And with that said, I think we've got a couple of announcements. We'll do one from Bob, and then the last one's from Adám.

00:01:22 [BT]

My announcement is a podcast called Technium [01] that I hadn't heard of before, and it's on YouTube as well. It's two guys, and they're pretty well informed about programming and developing and things like that, but they take a look at APL. I think of it as kind of the ChatGPT of looking at APL, because it's an outsider's view: they've done some research, but they do get some things wrong. So don't take everything they say as gospel. They actually do mention this podcast, but in a way that I'm not sure I take as a compliment, because they say it just shows you how far out these guys get ... it's kind of niche.

00:01:59 [AB]

I mean, interestingly, the title of the episode is factually wrong: "You need a special keyboard to code in this language: APL".

00:02:09 [CH]

Ohh yeah, I don't use a special keyboard for APL. That's probably the number one comment I get on videos and tweets and stuff: how do you even type this? And I'm just like, there is literally, like, half the population of the world, tops, that types and texts in a non ... like, Arabic-based alphabet. Arabic? Is that how I pronounce that word?

00:02:29 [AB]

Well, non-Latin.

00:02:30 [CH]

Non-Latin. Right.

00:02:32 [ML]

Well, it's also true for Arabic, yeah.

00:02:36 [CH]

And, you know, how many people type Mandarin Chinese? I don't know, close to, like, a billion and a half or two billion. Anyways, it's just sort of comical that there's this very Western point of view that, like, oh my God, if it's not A through Z, it must be so difficult to type, and it's really not the case. Well, and of course...

00:02:53 [ML]

The basic characters such as the ampersand and at sign, which you know everybody should take for granted.

00:02:57 [BT]

Well, just to give you an example: on this podcast they keep referring to "Notation as a Tool for Thought" [the paper is "Notation as a Tool of Thought"]. It's a small thing, but it's those kinds of things; they're just missing by a bit.

00:03:09 [AB]

Actually, there's a Canadian alphabet, the Canadian Aboriginal Syllabics, [02] that looks a lot like BQN, really.

00:03:21 [ML]

Support is not good enough. I would use some of those characters if I could. And also, I don't like to use scripts, because fonts render scripts differently from symbols; a lot of the time they handle the strokes in a different way.

00:03:34 [AB]

Sure looks like BQN.

00:03:35 [CH]

Alright, my list of things to do is going to include listening to this episode, and maybe we'll respond. Maybe we should all listen to it for next time and do a little mini response. So anyways, let's go to Adám for the final announcement.

00:03:48 [AB]

Yeah, on the APL Show with Richard Park, we've released another episode. [03] Go listen. We're talking about something really exciting, which is, like, primitives.

00:03:59 [CH]

Yeah, I just listened yesterday. I have to be honest, I still don't know if I understand the difference between structural under and ... what's the other one? Mathematical under? [04]

00:04:08 [AB]

Mathematical, computational under. Maybe we should do an episode just on that.

00:04:12 [CH]

Yeah, I think actually ... I don't know. I listened to it; it was great. Structural under is fantastic because BQN has it. I assume J ... I guess we can actually ... wait. Perfect, perfect way to introduce our guest here. So our guest, if you haven't deduced it by the number of times he's appeared, is Henry Rich. He was on episode 6, so actually our first guest, because even though it was episode 6, the first five we didn't have any guests. And he was also on episode 18, I believe, where he was doing something similar: he came on to update us on J. So if you haven't listened to either of those episodes ... Henry, who is, I don't know if you want to call him the resident keeper of the J source code, took over a lot of the work and has done a ton of work over the, I don't know how many, past years. So definitely go back if you haven't: listen to episode 6, then episode 18 if you want as well, then come back, unpause this episode, and you'll be a lot more informed. And I think today we're primarily going to be talking about J9.4, not 903, which is what we talked about the last time he was on. Maybe that's where we should start. Although, I realize I was asking something about J and then I said that's a perfect time to bring Henry on, and then I completely forgot...

00:05:29 [HR]

Structural Under.

00:05:30 [CH]

Oh, right, under. So I'm not sure if you want to introduce 9.4 first or talk about under for a sec. I kind of messed that up, but you go ahead.

00:05:36 [HR]

Well, I don't know what structural under is, so my contribution to that is pretty quick.

00:05:39 [ML]

Skip over that one.

00:05:41 [HR]

Is that something I should do, Marshall?

00:05:43 [ML]

The thing with both J and Dyalog is that they've done some stuff with mathematical under that does not quite jibe with structural under. So it's much more difficult to do it in J than in a new language.

00:05:53 [AB]

That's only computational, mathematical under, but it's time to add ampersand, colon, colon.

00:06:01 [ML]

Yeah, that's not a terrible idea.

00:06:03 [HR]

Well, it could be. I still don't have any idea what structural under does, so, you know, I can't really talk about it.

00:06:10 [CH]

Well, we've talked about under a little bit here, so maybe we should take a couple of minutes, for those that aren't listeners of the APL Show: what does the under in J currently do? The computational under.

00:06:21 [HR]

The concept of under is that it's a change of point of view. You have two verbs, u and v, and you're given an argument. It's like a similarity transformation in physics: you apply v to the argument, which presumably converts the argument into some space that you can understand. So you apply v, then you apply u, and then you apply the inverse of v to the result of u, which takes the result back into the original space of the argument. The most common usage is under open. Open unboxes an argument: I have some argument, I want to take the box away, do something to it, and put the box back. That's under open. But the concept applies to any ...

00:07:16 [CH]

Right.

00:07:22 [HR]

Any verb. It might be a matrix transformation; basically anything that has an inverse is applicable as the v argument.
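
[Ed.: in J, under is spelled &. (or &.: for the version applied to the whole argument). Two small examples of what Henry describes, using only standard primitives:]

   *: &. > 2 ; 3 4         NB. square under open: unbox, square, rebox
┌─┬────┐
│4│9 16│
└─┴────┘
   +/ &.: ^. 2 3 4         NB. sum under log: take logs, sum, exp back, i.e. the product
24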

00:07:32 [BT]

Taking it into the real world, I always think of anesthesia and an operation: you put the person to sleep, you do the operation, and then you wake them up. You want to do that in the right order.

00:07:45 [CH]

That is an example that came up on the APL show episode.

00:07:48 [ML]

I think that's one Iverson said he used too. I know Roger promoted it too, but I think it was Ken originally. Ken Iverson.

00:07:56 [CH]

Is there an explanation from either Marshall or Adám, in 60 seconds or less?

00:07:59 [ML]

I'll go for it.

00:08:01 [CH]

Explanation of the difference between structural under and computational slash mathematical under. If not, we can just kick it to another episode.

00:08:10 [ML]

So the geometric transformation perspective on computational under works really well for structural under too. The idea with structural under is you still have two functions, u and v, and v is this transformation. But here v is a structural function, so it's not allowed to, like, add numbers or anything; it just pulls out part of the argument. Actually, open is an example. Or you might say under first; that would be the kind of structural version of the under open thing. So you pull out the first element, and then you apply u to whatever the result of v is, that part of the argument. Or maybe it's reversed or transposed or shuffled around in some way. And then structural under puts that back. So the big difference with structural under is that it works even if v loses information. With this under first thing, you just take out the first element of your argument; that loses all the other elements, so you can't possibly invert that function. But what structural under does is kind of put the results back where they were in the original argument. So it remembers that information and it gets it back.

00:09:23 [CH]

So does this mean that structural under is a superset of computational under?

00:09:30 [ML]

Not really. There is an intersection that I call invertible under: if v is a strictly invertible function, like minus or reverse or something like that, where you can always undo it exactly. Actually, under reverse is both a structural and a computational under, so it's invertible; they both do the same thing. But computational under also does some stuff ... at least in J and APL and BQN, sometimes the inverse is not exact. It'll do things like inverting the square function, which is really nice because you can take the magnitude of a vector by summing under the square. So you square, sum, square root. There are two options for the inverse of the square: the positive and the negative square root, or generally one number and its negative, so it chooses one of those. And structural under never does that; it never makes that choice. So in that way computational under is not exactly compatible.
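
[Ed.: Marshall's magnitude example in J. This v (square) lies in the invertible intersection, and J's computational under picks the positive square root:]

   mag =: +/ &.: *:        NB. sum under square: square, sum, then square root
   mag 3 4
5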

00:10:37 [CH]

Henry, you were going to say?

00:10:39 [HR]

Yeah, in J we can emulate structural under. Take the case that Marshall mentioned, operating on the first item. The way we do it is we define a verb to use for v: the verb itself selects the first item, and its inverse would be defined as storing into the first item, whereas taking the first item by itself doesn't really have an inverse. If we know that what we're trying to do is modify the first item, we can create a verb that has a defined inverse, and that would provide exactly the structural feature that Marshall was talking about.

00:11:26 [AB]

How would that work? I understand it's lossy: as soon as you take the first element, you've lost information about the rest of the array.

00:11:34 [HR]

Ohh, OK I see what you mean.

00:11:37 [AB]

And another example would be under raze or enlist or ravel, or anything that destroys the structure.

00:11:45 [HR]

Yeah, you're right. You'd have to make it more complete; the verb would have to carry the argument along. OK, I take back everything I said.

00:11:56 [AB]

There's an adverb in J called amend, right? Or something like that.

00:12:07 [ML]

So that allows you to put elements into an array. But the thing about structural under that amend doesn't do is: structural under works at, like, all levels and stuff, so you can take the first element of the first element, or you can take the first element of each.

00:12:20 [AB]

So it's a superset of amend? [05] Yeah, right: it can not only dig into any part, in any structural transformation of the main argument; it also has the ability to apply a function there, rather than just replace elements there. So amend is ... [sentence left incomplete]

00:12:36 [HR]

Well, amend does that. Amend has a gerund form where you can specify the selector and the modifier and the thing to be modified. So yeah, for the particular case of first you could use amend, but that would not be general; it wouldn't generalize to things other than just amending.
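
[Ed.: for reference, J's amend adverb }, with the gerund form Henry mentions; the gerund sketch is from memory of the J dictionary and worth verifying:]

   100 (0}) 1 2 3          NB. amend: replace item 0 with 100
100 2 3
   NB. gerund form: (u`v`w}) y  is  (u y) (v y)} (w y)
   (-@{.`0:`]}) 5 2 3      NB. apply a function (negate) to the first item only
_5 2 3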

00:12:57 [AB]

And APL also has the At operator, and structural under is a very powerful superset of the At operator: At can do exactly two things, both of which under can do, but under can do so much more.

00:13:13 [ML]

Well, there's more than one thing it can do with the select half of that. It's really hard to figure out how to do it, but it can do reach indexing and stuff like that.

00:13:24 [AB]

Yeah, OK, but that operator can either index into an array and make changes there, by function application or substitution, or it can apply a Boolean mask to an array and use that to do substitution or function application, but that's it. Under can take an operand that selects or masks, in which case it has the same functionality, but it can also have an operand that does all kinds of things. So you can use a composite operand that reverses and drops some elements, then takes some elements, then selects the third one of those, then changes something, and then it all rolls back to where it came from.

00:14:09 [CH]

Alright, maybe we should do a whole episode on under and amend in J. I've got to say, At in Dyalog APL is something that I find extremely un-ergonomic; anytime I try to use it, the first way I try doesn't work and I have to put something in a dfn, or ... anyways. But it is one of the most ... I think, you know, we mentioned "tool of a thought", "tool for thought" ... now I'm saying it wrong: a tool of thought. Geez, Bob, we blame Technium. That's right, Marshall, get over there and get rid of those guys.

00:14:46 [BT]

I hope those guys listen to this episode.

00:14:48 [CH]

Yeah. Ohh man, that would be great. There's a guy on YouTube called the Primeagen who has, like, a Rust YouTube channel, and he Twitch streams. Someone sent me a video two days ago, or yesterday, of him watching and critiquing one of my videos. He's a bit crude; the jokes he sometimes makes are maybe not suitable for work. But he hates C++, and I was just killing myself laughing. So if anyone out there wants to do a review of these niche developers that are, you know, often in cuckoo land ... I don't take that stuff to heart. Someone said, oh, you're going to be upset when you watch this, and, like, three minutes in I was pausing the video and killing myself laughing. Anyways, let's get to the actual topic that we brought Henry on to talk about today, which is the new release of J 9.4. Maybe we'll start there, and you can talk to us about the switch-up in numbering: we went from 903 to 9.4.

00:15:54 [HR]

Oh, the numbering. Well, J started out J1, J2, J3, up to J7. [06] That was 25 years ago, and then it reset to J2.01, I think, or 201, and continued on with that numbering scheme up until 903. Now Ric Sherlock has done the work to let the Windows executables be distributed through winget, I think it is; I don't actually know, but it's something distributed by Microsoft that lets Windows executables be distributed by a trusted source. So that's good, and it uses a different numbering system, so we decided we would switch to that. So the new release is now J9.4.1; the betas are 9.4.0, and it's being released under that numbering system. 9.4 is a major release, and this is one of the biggest releases we've had in a long time as far as functionality goes. With this release, J finally becomes multithreaded, with full support for multithreading. This owes a lot to Elijah Stone, who did a great deal of the work. The idea is, there is a verb that will create a thread, and you create as many threads as you want; usually one thread per core is about right. You might have an application where, say, you're listening on a whole bunch of sockets, and you might want to have one thread per socket. That's really not a very efficient way to do it, but it would save you from rewriting your socket code. You create the thread once, and once you've got the thread, you can use it. You take your verb, let's call it u, and there's a new conjunction, t., and you say u t. '' [07], whatever the arguments are, and it just runs your verb as a task, runs your verb in a thread. You don't have to do anything beyond that. The result will come back in a box, and when you look at it, you'll see the result. And the result is always there; it's magic, of course. What happens is, if the spawning task looks at the result before the task has actually finished, it will block until the result is available. But you can spawn as many tasks as you like and then wait for them later. You can wait for them, or you can probe them to see if they have a result yet and avoid having your main thread wait. The big advantage of this to me is it's just so easy to create a multi-task application. If you have an array and you'd like to apply the rank operator, but you'd like to use a different thread for each item, then just create a task with a rank of whatever the item rank is, and it'll automatically run in multiple threads. You don't have to do anything more than that. So if you wonder whether your application could benefit from multithreading, it's really easy to find out: just try it with tasks and see if you get a performance increase.
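
[Ed.: a minimal sketch of the workflow described here, assuming the 9.4 spellings t. (spawn) and T. (thread management); the verb f and the exact T. argument codes are illustrative and should be checked against the release notes:]

   0 T. 0                  NB. add a worker thread to threadpool 0 (repeat, e.g. once per core)
   f =: +/ @: *:           NB. some verb worth offloading: sum of squares
   r =: (f t. '') i. 1e6   NB. spawn f as a task; r is a pyx, a boxed future
   NB. ... the main thread can keep working here ...
   > r                     NB. opening the pyx blocks until the task has finished
333332833333500000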

00:19:41 [CH]

So it sounds like all of the multithreading is on the user to encode. You're not going to expect some heuristic where we're doing a reduction on some massive array and, behind the scenes, J knows it'll be more efficient to launch a few threads; nothing like that will happen. But if you want to program it in, or test something out, or write the multithreaded reduction, that's something that is going to be quite easy for the J developer now, with 9.4.1 at least.

00:20:12 [HR]

Well, it would be very easy, but it's not true that threads are only used when the user asks for them. For verbs where we know for a fact that multithreading is going to be advantageous, and the prime example of that is matrix multiplication, we will use threads automatically. To take that example, we automatically run matrix multiplication on as many threads as you have defined. To get a little deeper: you can define threads in thread pools, and thread pool 0 is the default pool; it's the one that the system will use for things that will be better done in threads. So far, matrix multiplication [08] and some custom functions that we've written are the only ones that we've automatically threaded, but reduction would certainly be a good candidate for that, and maybe sorting too. As we identify those cases, we'll code them up so they'll multithread automatically, like we already do for matrix multiplication.
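
[Ed.: for instance, with worker threads created as above, an ordinary matrix product needs no change to the code:]

   a =: ? 1000 1000 $ 0    NB. two random 1000-by-1000 matrices
   b =: ? 1000 1000 $ 0
   c =: a +/ . * b         NB. matrix multiply; per the discussion, 9.4 splits
                           NB. this across threadpool 0 automatically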

00:21:32 [AB]

Does that include matrix division and inversion?

00:21:35 [HR]

No, just matrix multiply. Although, well ...

00:21:39 [ML]

So division uses matrix multiply under the covers, doesn't it?

00:21:42 [HR]

Yeah, and in the J implementation division is done by repeated multiplication, so that would speed up. Although really, LAPACK is faster for inversion; but if you had to invert a rational matrix or something, you would benefit from the multithreading there. There are a lot of things that don't benefit from multithreading. To get benefit from multithreading, you need a high proportion of computation to data movement in and out. Like adding two vectors: it's not going to help to try to do that in multiple threads. Reduction, on the other hand, probably could benefit from multiple threads, and we'll get to that.

00:22:31 [AB]

But that's dangerous, right? Because if you're doing a reduction that's computational, say a plus reduction, then you will get different results.

00:22:40 [HR]

Well, that's true. You have to be willing to accept irregular floating-point round-off. But that's true even if you're adding in a single thread, right? I mean, when you decide to use AVX2 instructions, they add with four accumulators at a time; well, now you're going to get different results than if you added one at a time. And AVX-512 is going to give still different results. If you really care about the accuracy of reduction, we offer a high-precision reduction that has predictable round-off, with essentially quad precision for the accumulation, but I think most users are content to let it round off as the machine chooses. An important feature of our threading implementation is that all the threads share the same J namespace. I have seen languages that said, well, we'll support multithreading, but the address spaces of all the threads are separate; they can't talk to each other. That's supporting multithreading, that's true, but what the user really wants is to be able to apply all the threads to the same problem at the same time, and to do that, you really need to have the threads share the same space for names and variables: globals, verbs, everything. So we do that. That's one of the things that makes it easy to try out an application and see how it benefits from threading, because you just say, well, here's my verb, run it in multiple tasks, and that verb will be automatically shared.

00:24:27 [AB]

So how does that even work? These are OS threads, not green threads, and I thought every process or thread had to have its own memory. How can they even do that?

00:24:40 [HR]

No, they share memory space.

00:24:42 [AB]

So that means you could have two parallel threads that both modify a global variable, and they can see each other's changes in real time, yes?

00:24:53 [HR]

That's good and bad, right?

00:24:57 [ML]

So there are performance effects if you're hitting the same memory, but the processor does make sure you get consistent results.

00:25:05 [HR]

Well, the processor doesn't; I have to. No, the program does, because it was quite a rewrite of the way the interpreter deals with names. I mean, the thing is, when you have multiple threads and they can all modify the same names, you realize that your concept of a symbol table almost goes by the board. Thread A may read a name, and thread B in the next nanosecond erases that name.

00:25:39 [ML]

Yeah, I didn't mean high-level consistency; I know you have to put that in. I meant that the multiple CPUs can all access the same memory, and nothing crazy and undefined happens.

00:25:52 [HR]

Right, but it was very subtle to make sure that the program still works even when names can be erased instantly.

00:26:02 [AB]

Or the values change, right? What happens?

00:26:03 [HR]

No, the value doesn't change. If thread A accesses variable x and thread B immediately writes to x, thread A will continue to run using the value that was in x.

00:26:18 [AB]

Oh, well. The point here is that the value has its own existence, separate from the symbol.

00:26:25 [HR]

So A will continue to run, B will overwrite x, and you could say you're probably in for something bad to happen there, but not necessarily. Let's say, maybe I'm doing some alpha-beta pruning in a game: I'm trying to find the best result so far, and x might just be the best value that any thread has found. I'm content with using an old value, as long as the value correctly represents the current state of the game. You need semantics above the level of the hardware, as Marshall was saying. Yeah, the hardware and the interpreter will make sure that you don't go off and access undefined memory, but you need a level of semantics above that to make sure that if two threads are modifying a shared variable, they're able to lock it so that the right one modifies it. So there's a set of concurrency primitives that give the threads those semantics for accessing the same memory. And here's where it really makes things easy: I have tried on several occasions to run multitasked programs in J by starting a new instance of the J interpreter, and it's always just a pain to share all the data. You have to send the data to and from some central task; the other task operates on it; it comes back. Somehow there has to be something that knows everything that has to be sent from task A to task B and what has to be gathered as a result. And all that goes away. You just start tasks A, B, and C, and they look at whatever global names they need to look at. They get the current values, and as long as those aren't modified, they don't have to worry about it. If they modify something, maybe they need to worry about it, but it really makes it easy to start a multithreaded application.

00:28:34 [BT]

And the sending of that information back and forth, that takes up a lot of time and energy, right?

00:28:40 [HR]

Well, it takes some time, yeah. It takes time to copy words from one thread to another. What happens is you share the computation among tasks A, B, and C, and each one of them finishes its result, puts the result in a box, and sends the box back. What returns into the result area is called a pyx. [09] That's a word for a special kind of box, and the pyx is the magic box: whenever you open it, it's got what you're looking for in it. The tasks return their pyxes, which point to the data. OK, so now let's say the main thread wants to look at the values, so it opens a pyx: ah, there it is, there's my data, and here's the pointer to it. So far nothing has slowed down; but when the thread wants to go and use that data, it's over in another task, in a different core. It's in level 3 cache somewhere and has to be brought back in, as if from level 3 cache. And that's where the time is spent. So there's time spent transferring the data to the task and transferring it back, but that overhead is only as big as the arguments and the result. So the key to making a good task definition is to do enough computation on the data that the computation takes more time than the transfer. If you do something that touches the data a dozen times in the thread, say, sort it, maybe it'll be a good trade-off: having the data operated on in fast cache in another task will be worth the data transfer. Not adding two vectors; that would be dominated by the data transfer.

00:30:44 [BT]

And the more work that you do before the pyx is provided, before the information comes back through the pyx, the fewer transfers you have to do back and forth between pyxes.

00:30:55 [HR]

Well, the more work you've done, the more you get the advantage of having multiple processors, right?

00:31:03 [BT]

Right.

00:31:03 [HR]

If I have a thousand things to do and I can split them among ten processors, great! That's only 100 per processor, and if the data transfer time adds only a few units on top of that, that's a good trade: I'll get the job done in, say, 110 units of work instead of 1000.

00:31:26 [AB]

But the problem is that processors today are so fast compared to the memory, which is relatively slow, that a lot of the work we want to do happens at memory read and write speed, so you can never get any speedup by sending the data somewhere else, because you need the data back again, and that transfer takes the same time as just computing it over here, right?

00:31:50 [ML]

So one advantage of using multiple cores is that each core has its own cache and that actually speeds up your memory usage, if you can keep each core in its own contained space that's running on its own. I think that's actually one of the really big advantages of splitting stuff across cores.

00:32:05 [HR]

Yes, well, and there's another ... [sentence left incomplete]. This is something I cannot find much information on, so maybe some of you guys know. Take an Intel processor [and its] level 3 cache. [10] There's a little bit of level 3 cache with every core; the level 1 [and] level 2 are directly tied to the core, but the level 3 caches in effect are in parallel, so the bandwidth to level 3 cache is pretty high. The level 3 cache is itself fairly slow, and there's a limit to how much one core can get from it, but multiple cores can each get the same bandwidth that an individual core can get, because those caches are actually running in parallel, even though the data is shared among the different cores. Each core [has] pretty fast access to it, and in that case there is a benefit even if you have to transfer the data through level 3 cache: there's a benefit from having multiple cores working.

00:33:16 [BT]

Because they can all be doing it at the same time.

00:33:17 [HR]

Yeah, because the cores work in parallel. The memory held by level 3 is not contiguous: the addresses are permuted so that any access to level 3 essentially goes to a random level 3 block, allowing many level 3 blocks at a time to share data over the ring interconnect. So the CPU designers have helped us out here. I think multithreading really is going to be an advantage for a lot of things. You just have to, I think ... [sentence left incomplete]. Usually you want to have an explicit verb that is several lines long and run that in parallel, or a large tacit function; something that accesses the data a few dozen times is going to benefit from being in a task.

00:34:09 [BT]

And working in J, do I have to do anything to kick that in? Like do I have to specify how many threads I want to work first or is it just going to kick in automatically?

00:34:17 [HR]

Yeah, we've deliberated over whether we should make this automatic for the user, but the idea would be that somewhere in your startup file you would say: "I want to create 10 threads", and you would create the 10 threads. And from that point on, they're available for use by tasks.

00:34:35 [BT]

And you were saying a thread per core is probably optimal. Is there any particular reason for that?

00:34:40 [HR]

Well, most J primitives can use all the bandwidth available to a single core, so there's not a whole lot of advantage in having two of them running on a core. It might be possible, but they're going to have less cache each.

00:35:00 [ML]

You're talking about hyperthreading here?

00:35:02 [HR]

No, I don't. I haven't measured it, but I don't see ... [sentence left incomplete]. Like, if you're adding two vectors, I will take all the bandwidth that the caches can provide with a single core. I just don't see that hyperthreading is going to help much.

00:35:21 [ML]

Yeah

00:35:21 [HR]

And I don't do very many mispredicted branches. I guess we can say that's a topic for exploration. My best guess is that one task per core is going to be the right number. It will depend on the task to some degree.

00:35:37 [ML]

Well, we don't think about it too much inside the array community, but I'd like to ask about the impact that using immutable arrays instead of mutable arrays has on this. [11]

00:35:44 [HR]

I don't have any immutable arrays really. Once an array has been referred to twice, I guess you could say then it's immutable. Is that what you're thinking of?

00:35:56 [ML]

Well, the semantics are immutable.

00:35:59 [HR]

Well ...

00:35:59 [AB]

All the array languages are like that, right? Everything is passed by value, so everything appears to be immutable. You can just create a new array that's similar or identical to the old one, but underneath the covers, every implementation keeps track of reference counting.

00:36:19 [ML]

Well, yeah, I mean, the only way to know is to have another copy of that array lying around and see if it changes. And given that it never does, I would say that's immutable semantics for arrays.

00:36:33 [HR]

Yes, yes, immutable semantics, but it's very important not to actually implement immutable arrays. I have gone to some trouble ... when the arguments going into a primitive are not referred to more than once, I'll delete them as soon as the primitive completes. You need to do that to reuse the memory, so that you keep the cache footprint low.

00:37:00 [ML]

Yeah

00:37:01 [HR]

And also you need to implement ... [sentence left incomplete]. If you're adding two arrays, and you can add them in place, that's much faster: two-address addition is much faster than three-address addition. So there's a process involved in deciding when it's OK to reuse one of these semantically immutable arrays.

00:37:24 [ML]

Yeah, true. I mean, but on the other hand, you're not dealing with the situation where two threads could simultaneously modify the same array. That seems like a pretty big deal to me.

00:37:36 [HR]

You're saying that they cannot modify the same array? Yes, that's true.

00:37:44 [ML]

Yeah

00:37:43 [HR]

They can in a limited sense, if they are passed ... [sentence left incomplete]. If the arrays you're talking about are pieces of an array (like you execute different tasks on different sections using the rank operator, so you create a task for every section), then the individual sections would be recognized as being reusable. But in general, you're right.

00:38:08 [ML]

So this is if the original array is cut into sections and then not referenced, deleted I guess, yeah.

00:38:17 [HR]

Right. That happens a lot, though, when you have just a sentence: the result of every verb is not referenced unless it's assigned to something.

00:38:28 [AB]

Or further used in the expression.

00:38:31 [ML]

It stays at ref-count one unless there's an assignment, yeah.

00:38:35 [HR]

Yeah, it stays at ref-count one. So if it goes into a fork I can't modify it on the right side, but I can modify it on the left side if I know that ... [sentence left incomplete]

00:38:45 [AB]

No, because you're not going back towards the right.

00:38:48 [HR]

We're not going back to the right.

00:38:48 [HR]

So we keep track of all that and retire the arrays as soon as they're not needed, so that their cache space can be reclaimed. Yeah, J has done that for a long time. But in these tasks, I suspect that the main thing is going to be explicit definitions that run on local variables, and those are not ... [sentence left incomplete]. Local names in J are not visible to anything else, and there the tasks will be completely separate. And reading from a shared global, there's no problem with that. If two threads want to write to the same global array, that's a problem. That's not a problem ... [sentence left incomplete]

00:39:33 [ML]

It's a performance problem [chuckles]

00:39:35 [HR]

It's a performance issue, right. [Thinking aloud: So how would that work?] Well, actually, if they lock the array before they modify it, each one of them would be able to operate on it in place. So yeah, two tasks needing to do that would use the locking primitives to guarantee that there's no simultaneous access to the name, and in that case they would not have to copy the array.
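
[Ed.: a hypothetical sketch of guarding a shared global with the locking primitives mentioned here; the names (mutex, bump, TOTAL) and the T. codes for create/lock/unlock are assumptions to check against the 9.4 release notes:]

   mutex =: 10 T. 0        NB. create a mutex (assumed code)
   TOTAL =: 0              NB. a global shared by all threads
   bump =: {{
   11 T. mutex             NB. lock: only one task past this point at a time (assumed code)
   TOTAL =: TOTAL + y      NB. safe read-modify-write of the shared global
   13 T. mutex             NB. unlock (assumed code)
   }}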

00:40:04 [AB]

So these pyxes, they're futures, right?

00:40:09 [HR]

Yeah

00:40:10 [AB]

Basically a promise that there will be a value in here at some point in the future.

00:40:18 [HR]

Right

00:40:19 [AB]

But if I understand it right, they are boxes. You said that in order to test your code with parallelism on the computation, you just stick that t. in it. But this actually changes the result value, right? Because now everything you get back is boxed, and you need to unbox it.

00:40:39 [HR]

Yeah, you have to unbox it. The t. conjunction is defined as applying u and then boxing the result. So if you want to be compatible, you would arrange to have your original verb produce a boxed result, and open it. That's cheap.

00:40:57 [AB]

So it's equivalent to ... [sentence left incomplete]

00:40:59 [HR]

Box atop u. Yes.

00:41:02 [AB]

Box, but running ... [sentence left incomplete]. It boxes up whatever you're doing.

00:41:06 [HR]

Yeah. By having the futures in a pyx, you can pass the pyx to another verb. You can do anything with it that you would do with any other box. And you don't have to wait for the result until you actually look inside.
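
[Ed.: that compatibility trick as a sketch: f t. '' presents the same boxed interface as <@f, so a caller written against the boxed serial version needn't change (assumes worker threads exist, as created earlier):]

   mean =: +/ % #
   serial =: <@mean        NB. boxed result, no threading
   spawned =: mean t. ''   NB. same boxed interface, but runs as a task
   > spawned 1 2 3 4       NB. either way, open the box/pyx to get the value
2.5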

00:41:24 [BT]

And pyxes are atomic the same way boxes are, right?

00:41:27 [AB]

But they are boxes, right?

00:41:29 [BT]

They are boxes, yeah.

00:41:30 [HR]

Yeah, pyx is a form of a box, right.

00:41:32 [AB]

That must give some interesting effects then. Let's say we have an array, so you're using ... [sentence left incomplete]. I don't know how you read this: the t., does it have a name?

00:41:41 [HR]

Yes, "task".

00:41:42 [AB]

Execute this task [chuckles]. So if you say: my function, task, rank something, whatever ... rank negative one, you apply it on the major cells. Then you get a list back, a vector.

00:42:02 [HR]

Right, a list of pyxes.

00:42:04 [AB]

So the information, the overall information, about the rank of the result is temporarily lost until you do an unbox rank 0, right?

00:42:22 [HR]

Right. You would ... [sentence left incomplete]

00:42:24 [ML]

You can just use unbox.

00:42:25 [AB]

Oh, OK, right.

00:42:26 [HR]

Well, but no, I might not ... [sentence left incomplete]. If I expect the results in the individual boxes to be conformable, then I could open the whole thing, but I might just want to look at one box and see what its result is.

00:42:43 [AB]

But yeah, if they're not ... if they don't conform by the time when you open them all up, then ... [sentence left incomplete]

00:42:51 [HR]

I mean, it's not an error unless ... [sentence left incomplete]

00:42:52 [AB]

No, but you're padding. I don't know how J works on this; if you unbox something ... [sentence left incomplete]

00:42:57 [HR]

Yes, it would pad up, but if one of them is numeric and the other is character, you just ... [sentence left incomplete]

00:43:02 [AB]

... you get an error at that point.

00:43:02 [HR]

Yeah. I think I've diverted you from your point. So yes, you get a list of pyxes.
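
[Ed.: Adám's scenario sketched in J, with the same caveats as above about exact 9.4 spellings: one task per major cell via rank, then one unbox:]

   f =: +/ @: *:               NB. sum of squares again
   ps =: (f t. '')"_1 i. 3 4   NB. a task per item: ps is a 3-element list of pyxes
   > ps                        NB. opening waits for all tasks, then reassembles
14 126 366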

00:43:12 [AB]

It's just that's starting ... [sentence left incomplete]. It's something [we've started] thinking about [and] talking about at Dyalog. We haven't implemented it as a primitive yet, but we have models that behave like primitives for running in parallel on OS threads. [12] We already have a primitive for running green threads. I don't think anybody has suggested changing the structure like that, which is why running it with rank is not going to work well, because as soon as the result comes back ... [sentence left incomplete]. Well, as soon as we want to pass the results on further, then you would have to await everything. Otherwise you cannot know anything about the results; we don't know what rank they have.

00:43:58 [HR]

Well, if you stick it in a box, you don't have that problem. That's why I did that.

00:44:02 [AB]

Exactly, but sticking it into the box effectively changes rank into an "each" even though "each" isn't a primitive in J. But "each" is really just rank zero under ... [sentence left incomplete]

00:44:14 [ML]

Well it is under box rank zero, which in J, you kind of get implicitly.

00:44:20 [HR]

Yeah

00:44:22 [AB]

We've only been doing it so that it has value for each, and that's really what's happening in J as well, except we also do the whole boxing unboxing transparently, whereas in J you'd have to be explicit about the unboxing.

00:44:36 [HR]

Right. I don't see that as a problem. Yeah, that's true.

00:44:41 [AB]

Well, it just means that with the proposal we have at Dyalog, you can actually add this parallelism operator and expect everything else to work the same; you can't detect the difference in the value.

00:44:54 [ML]

Well, Dyalog also has the problem that because it's got this floating array model, characters and numbers sort of leak out of boxes.

00:45:02 [AB]

Yes, it wouldn't work anyway to try to box things up.

00:45:05 [ML]

That is a significant implementation issue I guess.

00:45:08 [AB]

Yeah

00:45:09 [ML]

It's not impossible to support.

00:45:10 [AB]

No, I'm just contrasting them, because we very much see, at Dyalog, other array language implementations as beneficial to us, because they'll go off and try various things out, and then we'll be the slow ones to learn the lessons [Henry agrees]. 10-15 years later, we implement something similar but better, because [chuckles] we know the mistakes everybody else made.

00:45:32 [HR]

I think we waited a long time for this in J [chuckles]. So we benefited from some other people's mistakes too.

00:45:38 [ML]

Everybody's sitting around waiting for someone else to do something... [everyone laughs]

00:45:42 [AB]

Exactly

00:45:43 [HR]

The one thing we did do (this was Elijah Stone's push) was we used futexes [13] rather than mutexes for the synchronization operations, and that's a subtle difference, but it's worth looking at; it's a pretty clever concept. We tried to make the task sharing ... [sentence left incomplete]. A task in this context, [we] should define, is an internal ... a job, really. A job is an internal task such as matrix multiplication, where the user says: "here's two 10,000 by 10,000 matrices; multiply them", and we'll split that up into 20 or so smaller multiplications and give them to the threads. The threads then have to contend with each other: the threads will be launched, they'll be woken up, and they have to contend for access to the job control block before they can do their work. So the amount of overhead you spend while the threads are fighting over the job control information becomes important. Anyway, by using futexes, we have discovered that our internal jobs can be as short as 400 nanoseconds. That's not very long, but even with jobs as short as 400 nanoseconds per work unit, the amount of arbitration delay is negligible. Mutexes tend to waste time, as each thread has to go to sleep and wake up and grab the mutex and decide what to do. This futex implementation has very low overhead, and I'm pleased with that: there's not a lot of processor or operating system overhead involved in switching tasks.

00:47:49 [BT]

So a futex, is that something that would like, when it finishes its task, it looks ahead to see if there's another task rather than shutting down?

00:47:56 [HR]

OK, here's the problem with the mutex. You say: "is there any work to do?". "No". "OK, then I'll wait". That sounds simple, right? The problem is: what happens if work comes in between the time you look to see if there's work and the time you wait? That's a timing window that you can't close. So with a mutex, what you have to do is say: I'll look for work; I don't think there's any work. OK, let me grab the mutex (so I seize control of the resource), then I'll look again to see: did something come in? And if not, then I'll go wait, and part of the waiting is that the system automatically releases the mutex. That means that if you have many, many threads (and processors are coming up with hundreds of cores now), every one of them has to grab the mutex before it waits, and that's bad because it takes time.

00:49:08 [HR]

Anyway, the futex idea is really clever. Instead of using a mutex, there's a shared location in memory, and I look at that. So when I go to wait, I say: OK, what is that location in memory now? When I read it, it was 300. And then my wait operation says: "I'm going to wait; tell me when the value has changed from 300". It's just so, so clever. So now the operating system eventually gets control and says: "oh, he thought it was 300, but somebody advanced it to 301". Whenever you add a unit of work, you increment the futex, you see? And so there's no window. You don't have to grab the mutex; it's implied in the way the futex works, and it's just more efficient, and we can measure the difference in the speed that heavily multi-tasked jobs take.

00:50:02 [BT]

And that's as a result of all the threads being able to look at the same memory. If they couldn't do that, you wouldn't be able to use the futex, is that right?

00:50:08 [HR]

That's right, that's right. But the same thing is true in a mutex system: there has to be something shared; the mutex has to be shared [Bob agrees]. With the futex, all you have to share is a single four-byte value, and the operating system will avoid the overhead. Which is good, because it's only one time in a million that somebody actually does add a task between the time you looked and the time you waited. But one in a million is pretty often in the computer business, so you have to worry about it.

00:50:43 [BT]

If I was going to use an analogy ... it's sort of like a DMV or a hospital waiting room or something, where that number is changing on the board. A mutex would be like you have to wake up the patient, point them at the number, and then put them to sleep again [Henry chuckles], and a futex is like everybody's looking at that number: "oh, it changed". Right? Is that sort of the way it works?

00:51:08 [HR]

[Hesitatingly] I don't think so. I think an analogy like that could be made, but I don't exactly see how to do it.

00:51:16 [BT]

OK.

00:51:17 [BT]

Well then just ignore what I said there because I just made up an analogy that doesn't work [chuckles].

00:51:21 [HR]

No, but no! That's a very good analogy. You come into the butcher shop, and the sign says "now serving 400", and you take a ticket. With the mutex, I think, you take a ticket and then see if it's 400. But with a futex, you just look up and see 400, and then you say: "I'm 400; can you serve me?". And usually the answer is: "yes, I can serve you". But the answer might be no: somebody else came in and got 400 before you, and you'll have to go back and check again. The point is that the case that has to be guarded against is rather expensive and extremely rare, and the futex avoids the overhead in that situation. Anyway, that's the first thing about this new J release, and probably the most important. Well, I don't know if it's the most important, but it's big. It's something we've been thinking about for years: when should we go to multithreading? And I think the processor vendors are telling us that now's the time. If a CPU has 100 cores and you can only use one of them, that's a hard sell to say that you're really taking advantage of the machine.

00:52:38 [HR]

The second big thing in this release is that the J extended-precision math library has been switched from the internal one we used to use to GMP, [14] which seems to be one of the leaders in extended-precision math. It's more space-efficient, it's faster, and for multiplication it's way, way faster, because it supports FFT-based multiplication methods, which our internal version never did. That was Raul Miller's work. I had no idea how hard that was going to be when we proposed the work, but he stepped in and got it working. So anybody who wants to use extended precision in J will find that it is much faster than it used to be.
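
[Ed.: extended precision in J is the x type (an x suffix on a literal, or x: to convert); in 9.4 this arithmetic now goes through GMP:]

   ! 40x                   NB. exact factorial of an extended integer
815915283247897734345611269596115894272000000000
   */ x: 1 + i. 40         NB. the same value via x: conversion
815915283247897734345611269596115894272000000000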

00:53:27 [BT]

As an example, in the APL Farm (I think it was actually last night) I was watching a discussion where somebody was trying to do a Rosetta Code problem that involved extended-precision primes, and they said: "oh, I got it working; it takes about two or three minutes". Somebody else replied: "really, minutes? Or do you mean seconds?". And they said: "no, no, I meant minutes". And the person said: "are you running 904, or are you running 903?". "I'm running 903". "Oh, you should switch to 904, because what you just ran in two or three minutes took me under a second and a half".

00:54:08 [HR]

Well, if that's right, and if you're multiplying really big numbers, it could be thousands of times faster. It's great to have that; we don't have to be ashamed of our math library anymore. The third big thing, and actually the thing that will probably make the most difference to the average user, is that the error messages [15] are much, much improved. Before, we'd have to say the error messages really sucked: you do something and you get "domain error", and that's all it is. Or "length error": you did something wrong, but I'm not going to tell you which two lengths don't match. This is partly because of the way the J language is designed, and it might be true for any language that has a significant amount of tacit programming. Say the parser encounters a train; say it's the fork that calculates an average (plus slash divide tally), right? So, three things. The first thing that happens is the parser parses the fork into an anonymous verb that takes the average. Then the parser launches that verb, and say the verb fails. There's no information in the verb that connects the failing verb to the token number and the word number in the sentence that caused the problem. And remember that with tacit verbs, you could have two dozen primitives in this anonymous verb that's executed by the parser as a unit. So all we've been able to do in the past is say: we tried to execute this big thing, and it failed somewhere, and it had a length error. What we do now is capture enough information at a low level that we can indicate which primitive actually failed, and then we take the arguments to that primitive and send them off to a J function whose job is to analyze the error. So there's a J verb, about 1000 lines of code, that knows what the different errors from the different primitives mean. If it sees a length error, it can figure out what parts of the shapes were supposed to match and didn't, and call that to your attention. So I think with one stroke we've gone from having error messages that sucked to error messages that are pretty good. We'll see how the users like it, but it's a big point for usability, I think.
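
[Ed.: a tiny illustration of the case Henry describes; the exact 9.4 message text isn't reproduced here:]

   mean =: +/ % #          NB. the average fork: one anonymous verb, three primitives
   mean 'abc'              NB. fails at +/ (characters can't be summed); pre-9.4 could only
                           NB. report a domain error on the whole train, while 9.4 names
                           NB. the failing primitive and analyzes its arguments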

00:57:10 [CH]

Does it make use of ... well, I shouldn't say newer languages; even older languages are stealing this idea from newer languages: the caret that creates a little visual pointer to point at the failing verb. It's a nicety, but it's become almost expected; with the new languages that don't have it, people sort of raise their eyebrows and say: have you seen what everybody else is doing?

00:57:38 [AB]

Everybody else. APL had this in 1966...

00:57:43 [CH]

APL?

00:57:46 [HR]

Yeah, we do it in some cases. Mostly what we do is we type the sentence in error and leave three spaces before the actual location of the error. In some cases we print a caret. If you're telling me that all the cool kids print a caret, I can go do that.

00:58:10 [ML]

I would probably do that, because I've always found the space display confusing. I mean, what if there are spaces in the original function? Then you have to kind of line up and ask: where are my spaces?

00:58:22 [HR]

In the ... [sentence left incomplete]

00:58:22 [ML]

It removes spaces from the display, right? But still, I've found that confusing in the past.

00:58:28 [HR]

Yeah, OK. Well, I'll put that in for 9.5, 'cause that would be trivial to do, and we do it sometimes. Ohh yeah, one good thing: you know, mismatched parentheses... they're a problem.

00:58:47 [ML]

No, there's ... [sentence left incomplete]

00:58:50 [HR]

Because, you know, if you added a parenthesis, it just totally changes the way the system parses the sentence. So you'll get a spurious error indication, with no hint. Well, anyway, whenever we have an error like this, whenever we have an error, I scan the sentence to see if it has mismatched parentheses, and if it does, I'll report the error that the parser found, but I'll also say: and by the way, dude, you've got a parenthesis mismatch over here, so maybe you'd like to look at that first. So we can report two errors at once now in a sentence, where the parenthesis is probably the actual source of the error.

00:59:37 [AB]

Yeah, that one is fun in Dyalog APL. You can have mismatched parentheses and the interpreter will happily start executing the expression, and that might have effects before it hits the place where it just can't continue anymore.

00:59:55 [HR]

Yeah, indeed.

00:59:56 [AB]

So you can have a partially, OK, partially evaluated ...

00:59:57 [HR]

Same thing in J.

01:00:01 [AB]

... expression that has mismatched parentheses.

01:00:03 [HR]

Yeah, well, you know, I was thinking: why shouldn't we scan the sentence for mismatched parentheses when the explicit definition is defined? Can it be right to have a ... I guess the ... [sentence left incomplete]

01:00:14 [AB]

Maybe you leave the execution before you even hit that parenthesis.

01:00:18 [HR]

Well, I'd just fail the definition. I just tried to define a verb that contains a sentence with ... well, you know, it could be ... that would ... [sentence left incomplete]

01:00:28 [ML]

I can't imagine anybody complaining too much about their function that had mismatched parentheses but exited first, so they never noticed the issue.

01:00:39 [CH]

Welcome to the Python experience.

01:00:40 [AB]

Yeah, I've, I've ... [sentence left incomplete]

01:00:41 [CH]

Seen some code, and it doesn't work, and, you know: well, we never executed that branch in my unit test, so we're gonna call it a day.

01:00:49 [ML]

But once you get there, it's awful easy to fix, especially if the interpreter is pointing out where it is.

01:00:55 [AB]

No, but which parenthesis is the wrong one?

01:00:59 [ML]

Yeah, that's the big problem. But I mean, there's just no right answer to that. There are nice heuristics.

01:01:06 [HR]

Yeah. OK, all you can do is say this is mismatched, but you can't say ... [sentence left incomplete]

01:01:10 [AB]

But do you complain at definition time if the control structures are mismatched? Then it would make sense to do that for parentheses too.

01:01:16 [ML]

Yes, there's precedent.

01:01:17 [HR]

Yeah, yeah. Those are things to think about. At least it's nice to have work to do.

01:01:23 [ML]

So what I'm wondering is: now you've saved information, so you can go in and see exactly the function in the train that caused the error. But what if it's defined somewhere else? Do you go to the place where that function is in the source code? Like, if you have an operator that takes an argument that can be a number of functions, and you pass the average function in to it, and then you run it and ... I mean, average is not too likely to get an error, but if you had a function that was tacit and did cause an error, what happens there?

01:02:03 [HR]

The error is going to be reported on the source line. So, let's say we have a tacit verb; well, now let's say we execute ... [sentence left incomplete]

01:02:13 [ML]

So it's the bottom-most explicit source line.

01:02:19 [HR]

Yes, the bottom-most explicit source line. Say it has a train that looks like left parenthesis, A, space, B, space, C, right parenthesis, so it's executing those three verbs. If the failure is in C, that line would fail with the error pointing to C, and the message saying there was an error in whatever the primitive was, which could be deep down inside C. It'll look at the arguments to the failing primitive and tell you what the primitive didn't like, and it's up to you to go from there. At that point, maybe you should turn on the debugger so you can go through the call stack and see it.

01:03:13 [ML]

Or just look at the definition of C.

01:03:14 [HR]

Well, right, but C might be four more names, you know. You can have a tacit definition that goes...

01:03:22 [ML]

It could be an arbitrarily complicated tacit function; the error just can't be in any explicit part of the function C.

01:03:29 [BT]

Yeah. And that breaking down and looking at the arguments to something that fails: would that be available for something that didn't fail? Like, can we get access to that?

01:03:37 [HR]

No. I mean, the problem is, executing a sentence, there are just oodles of places where primitives get executed. I would have to save all those results, and there's something special about the one that fails, right? It's the last one, and it's the one you care about. Well, but, you know...

01:04:04 [AB]

But what are you asking for?

01:04:07 [BT]

Well, I've been thinking for a long time about some type of visual debugger where you can see the structure of the verb and then go in and basically logic-probe different areas and see what the arguments are. But as somebody pointed out, you have to keep all those arguments around, and that takes up a huge amount of space for what you're trying to do. But if you're trying to figure out a verb, it does give you a real leg up, letting you dissect it in real time.

01:04:31 [HR]

Yeah. Dissect does that.

01:04:33 [AB]

But that only runs through everything and then gives you the analysis, right? You can't step through it.

01:04:38 [ML]

Yeah, it's running it in its own mode, so it's got all sorts of extra overhead added, but if you're trying to figure out what the expression does, you probably don't care, right?

01:04:47 [HR]

That's the idea. Yeah, I mean, it executes every verb on every individual cell, and each result cell goes into a table where it's kept track of. Yes, tremendous overhead.
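For reference, dissect ships as a J addon; a typical session looks something like this sketch (assuming the debug/dissect addon is installed, and with an arbitrary example sentence):

   require 'debug/dissect'        NB. load the addon
   dissect '(+/ % #) 1 2 3 4'    NB. opens a picture of the sentence, one node per verb,
                                  NB. with every intermediate result recorded for inspection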

01:04:59 [AB]

Bob, have you seen the presentation from the Dyalog user meeting in '22 where John Daintree does his token-by-token debugging? [16]

01:05:10 [BT]

No, I haven't.

01:05:11 [AB]

In that case... it sounds like something like that, and it should be coming, probably not in the first coming version of Dyalog, but at some point after that, where you can not just trace through statements, but go in primitive by primitive and inspect what arguments they're getting and what results they're giving, and when you hit an error you can see that as well.

01:05:35 [BT]

I'll have to take a look at that. From what Henry's saying, when you put together a fork or a train, you've created a new verb. You don't think of it as the individual parts; it's now a new verb that does the thing, and using the tokens, you're breaking it back apart. And, I don't know, the trick is between those two levels of understanding.

01:05:55 [AB]

But for analyzing what's going wrong, or why you're getting a result that you're not expecting, or for learning purposes, it's tremendously useful to be able to follow along as the interpreter traverses this function tree and runs it.

01:06:11 [BT]

Yeah, I actually did a promo video. I built the whole thing out of Keynote; it doesn't actually work, it just looks like it's working, but I've got it, and I'll put it in the show notes. It's a video of what I kind of think would work: with the mouse you go around different parts, and you can see the structure of the verb and tell what's happening. It's kind of cool. I haven't put it into place yet, but it sounds similar to what John's doing.

01:06:37 [CH]

All right, I actually don't know, but I think we passed the hour mark; it's hard to tell when you hit the record button. But is there anything that we haven't mentioned in terms of, you know, primitives that were added, or that's worth mentioning before we go?

01:06:52 [AB]

Well, what is slash dot dot? [17]

01:06:57 [HR]

I think slash dot was incorrectly defined. I think in Dyalog you did it right. This is one of those examples of waiting till somebody else does it wrong first.

01:07:09 [AB]

That's key. Ah, yeah.

01:07:10 [HR]

Key, yes. What key does is classify the items of the right argument based on the values in the left argument, so subsets of the right argument that have identical values in the left argument are grouped together, and the verb is applied to them. You would say that the correct definition would be: make that verb dyadic, and let the left argument be the key value that they all have in common, and the right argument be the subset of items to be operated on. Which I think is what Dyalog does, but J's key was not defined that way; it just throws away the information in the left argument to key. So this slash dot dot is just like key, except it executes the verb as a dyad and gives it...

01:08:12 [AB]

Another left argument.

01:08:13 [HR]

Yeah, it gives it a left argument, you know? So if the left argument is a bunch of fathers' names and the right argument is a bunch of children's names, you might want to connect the fathers with their children. With the old-fashioned key, you get the groups of children's names, but you wouldn't have the fathers' names; you'd have to come up with those by some other means.

01:08:40 [AB]

Yeah, you basically would be forced to, I guess, box and then do an each, yeah.

01:08:49 [HR]

So yes, exactly. And so this is just an improvement. I have wanted it half a dozen times in 30 years, and I think it's just properly defined this way.
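A small J sketch of the difference, using numeric keys and values (assuming 9.4's /.. behaves as just described; the boxed display is illustrative):

   k =. 1 2 1 2 1        NB. keys, think father ids
   v =. 10 20 30 40 50   NB. values, think children
   k +//. v              NB. classic key /. : the verb sees only each group
90 60
   k <@,/.. v            NB. new /.. : the verb runs as a dyad, key value on the left
┌──────────┬───────┐
│1 10 30 50│2 20 40│
└──────────┴───────┘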

01:09:03 [AB]

It's interesting, because I think key was added to Dyalog APL just before I joined the company, and I know that Roger was implementing it and saying that, well, they were fixing the issues with J's key.

01:09:14 [HR]

Right, yes.

01:09:22 [AB]

So yes, that's exactly the thing. So Dyalog probably waited like 15 years or something after J got it until it was in the language. But actually I have a different problem with key, which is: when it does what I want it to do, then yes, it does it really well. But often I have a vocabulary that I want to look at. So let's say I want to use key to do letter frequencies. If I just use key, then I have two issues. One is that if any letter doesn't appear, I don't get a count of 0 for it; it just doesn't appear in the result. And the other is that it includes things I don't want to count, punctuation and so on.

01:10:08 [HR]

Well, you could throw those away when it's over, right there. What's normally done to get rid of the missing-letter problem is you take your whole alphabet and prepend it to the text, so that everything appears at least once, and then subtract 1 from the counts.
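In J, the trick looks something like this (a sketch with a made-up input string; the filter line also handles the punctuation issue raised above):

   a =. 'abcdefghijklmnopqrstuvwxyz'
   t =. tolower 'Mississippi, again!'
   t =. t #~ t e. a            NB. keep only characters in the vocabulary
   counts =. <: #/.~ a , t    NB. prepend the alphabet, count, subtract 1:
                               NB. 26 counts in alphabet order, zeros included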

01:10:24 [AB]

That's both awkward and potentially very expensive.

01:10:26 [HR]

Right. Well, OK, append it to the end of the text, and it might append in place. We will in J; I don't know about...

01:10:33 [AB]

Dyalog. But then you need to sort after that.

01:10:35 [ML]

There's also an ordering problem.

01:10:35 [AB]

Yeah, because if you put it at the end, then the letters will appear in whichever order they appeared in the input. If you put them at the beginning, they will be sorted according to the alphabet.

01:10:45 [HR]

OK. I guess it depends on what you want.

01:10:47 [AB]

Right. But then you have to sort afterwards, which can also be expensive. So I was just curious about adding a new key but not fixing those issues. Yeah, time for slash dot dot dot or something.

01:11:02 [HR]

I haven't seen those as issues much. But those are the high spots. There are numerous performance improvements, but nothing to compare with the threads and GMP and the error messages.

01:11:11 [CH]

We always do this, but we'll leave a link in the show notes to the changelog, or, I think there's a... yeah, release notes that document everything that's changed. [18]

01:11:18 [HR]

Release notes.

01:11:24 [CH]

I guess that's sort of the last thing, and maybe the most important thing to ask: is this available? Because I think I looked it up, and right now the jsoftware wiki still points to the beta.

01:11:35 [HR]

It was supposed to go out yesterday. I just got an e-mail from Chris, and it's all working except the Mac installation. There are no changes to the interpreter; the problem is this damn change to the numbering system. You know, something breaks because it's not 904 anymore, it's 9.4. It will be out before this podcast is released, I'm sure.

01:11:59 [CH]

That's what I was going to say: yesterday for us is like five days ago for the listener. So hopefully by the time the listener is listening to this, 9.4.1 is out.

01:12:10 [HR]

I feel certain it will be out.

01:12:12 [AB]

We'll leave a link to the download page.

01:12:13 [CH]

Yeah, this is like time travel happening right now. Currently, for us, it's not available, but for the listener listening to this right now, it is.

01:12:21 [HR]

That's true.

01:12:22 [BT]

Well, it's available as a beta, right?

01:12:24 [HR]

Yeah, it's, it's.

01:12:24

This is true, yeah.

01:12:25 [HR]

It's the same as the beta, no changes since the last beta.

01:12:26 [BT]

It will be the same.

01:12:28 [AB]

But I wasn't able to download the beta either, because of a broken link for it, probably because of the renaming of things.

01:12:32 [CH]

Uh oh. Dun Dun Dun.

01:12:35 [HR]

Yeah, that was a nightmare. I hope it's worthwhile. I think it will be.

01:12:43 [CH]

Well, this has been awesome. I mean, I'm definitely gonna have to... actually, because I got a new OS, I don't have J locally right now, so I've been using the J playground for the last couple of weeks. But I'll have to go and download 9.4.1 to mess around with the tasks and see how much trouble I can get myself into.

01:13:02 [HR]

That would be great.

01:13:04 [BT]

I was going to say, that's a good question: will the J playground update to 9.4?

01:13:09 [HR]

I don't have anything to do with the J playground. I'm in awe of the project.

01:13:16 [CH]

Probably the person that works on it is listening right now, and they're thinking: yes, it definitely will be updated. And if they're not thinking that, well, I just basically put those words into your mouth, and so now you need to go make it happen.

01:13:30 [BT]

We'll see how busy Joe is. But there wouldn't be anything in 9.0.4, or 9.4, that would keep Joe from using that source, would there?

01:13:39 [HR]

Not that I know of. I mean, we try to run on everything back to the 32-bit Raspberry Pi. Well, actually, multithreading doesn't work on that.

01:13:50 [AB]

Yeah, that's what I was about to say. But hold on: the playground runs J compiled to WebAssembly, right? Running in the browser. How do you spin up OS threads in the browser?

01:14:02 [HR]

I don't know. These are POSIX threads.

01:14:05 [CH]

I mean, I feel like if you're running J in the browser, you're probably not in a situation where it matters that you can't spin up, you know, 12 different threads for your computation. I think if you're at the point where you need to spin up threads to get a performance increase, maybe think about downloading the actual executable.

01:14:27 [AB]

I have this coming up all the time with TryAPL. People are trying to, like, write, I don't know, production applications or something running on it.

01:14:33 [CH]

I did get a YouTube comment the other day that said they found TryAPL better than RIDE, which I was kind of shocked by, because there is a certain amount of latency with TryAPL. It's not terrible, but not when you're used to hitting a button and it's instantly there. For what it's worth, I actually did use TryAPL for maybe half a year or a year until I finally went and downloaded one of the editors. So maybe they're just at that point as well, but at a certain point you start doing a certain amount of work where the latency matters, and once you switch, you're never going back. That being said, now I'm rambling on about BQNpad, which I religiously used at one point. Back when I had Linux, I did install the BQN executable, CBQN, just to try out the color syntax highlighting that Marshall announced at some point, and it was awesome, because J does not have that. If you're running J on Linux, or I think actually even if you have the little GUI thing on Windows, it doesn't have inline syntax highlighting.

01:15:49 [BT]

You can do that. You can do that in JQT by editing your styles. The default style is just...

01:15:57 [ML]

This is in the terminal, though.

01:15:58 [BT]

Oh yeah, not in the terminal, but in the IDE, in JQT. [19] You can go in and configure your style, and then, I think, you have to go into your base and actually enable inline. When you do those two things, you can convert to whatever colors you want.

01:16:15 [CH]

So I guess on Windows, if you're using JQT, you can set it up; it's not the default. But on the terminal one, which is typically what I would use when I'm on Linux, there are only actually a few interpreters that have this. There's IPython; Elixir has one called iex that has syntax highlighting; I'm sure there's a Haskell one, although the default one, GHCi, does not; and then BQN. So off the top of my head, of the languages that I've used, only a couple of them have it. And, I don't know, it doesn't really matter, but when you're used to being in VS Code, or your editor of choice, the syntax highlighting is actually a part of parsing and being able to read. Anyways, point being, these online playgrounds, TryAPL and BQNpad. All right, this will be the last 30 seconds of my monologue. Every other language should go and do something similar to BQNpad, because I had someone comment on a YouTube video the other day about this too: the in-place updating, where you don't see the iteration of every modification when you're building up some solution. What was it? Was it Stevan Apter? Was it Joel Kaplan that said, you know, solution by successive approximations, or something like that, [21] which was anyways a great quote. I don't actually want to see the iterations; I just want to add a tally to the front and then see it go down to a single number, and then add something else. BQNpad [20] has both: you can hit shift-enter to get the iterations if you want to see them, but if you don't... Maybe I'm biased because I make YouTube videos, but for a YouTube video it's absolutely beautiful, because you're just building it up, your screen doesn't move, and then your solution just gets spit out to the screen. I'm sure Andrey Popp and the folks that work on it weren't thinking of content creators, but they went and built this thing and I'm just like, oh, this is awesome. Anyways, I guess that's the third thing that I've put out into the world, asking people to create it for me without actually doing any work myself.

01:18:20 [ML]

But now, this is one place where the language semantics are really important. BQN does this, or at least we've tried to do it, there are probably bugs in one or two places, without ever messing up your previous results, so the results you get are always consistent with what you'd have gotten if you had just typed it that way in the first place, even though it's evaluated your statement and done a preview. It can't always do that: in places where you modify a variable that's at the top-level scope, it can't. But in most BQN programming you don't end up doing anything like that, so it can actually keep track of what's supposed to be there and then throw out the changes it made while doing the expression. So it handles a whole lot of code without having a bunch of messy modifications and imperative stuff that it has to go back over and undo.

01:19:23 [AB]

I still don't understand how you can actually do that. When you do change values of variables in the global scope, how does it not affect things every time you modify the line?

01:19:35 [ML]

So it actually makes a new scope for itself to.

01:19:38 [AB]

Oh, it makes a temporary scope. It clones the scope every time, so it doesn't affect the original over and over again.

01:19:45 [ML]

And of course, any function that it calls is going to execute in its own scope, so once the function is done, the scope disappears, usually, and it can just be thrown out. So basically the main mode of operation is that it's making space for itself to put these temporary results, and then at the end it just throws them out instead of trying to undo any changes it made, because that's really error-prone.

01:20:11 [AB]

Yeah, OK, so just by cloning the scope.
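As a loose analogy in J (not how BQNpad is actually implemented, and the name preview here is hypothetical): you can get the same discard-instead-of-undo effect with a scratch locale, executing the preview there and erasing the whole locale afterwards:

NB. run a sentence in a disposable scope: reads fall through to base,
NB. but any global assignments land in the scratch locale
preview =: 3 : 0
  cocurrent 'scratch'
  coinsert 'base'        NB. let the preview read names defined in base
  r =. ". y              NB. execute the sentence in the scratch locale
  cocurrent 'base'
  coerase <'scratch'     NB. throw away whatever the preview defined
  r
)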

01:20:13 [CH]

Look at this: Adám's already working on it in his head. Fantastic. And the beautiful thing about this is that... but...

01:20:19 [ML]

It's very important that it has this very clean lexical scoping model for that. I think they have it in JavaScript, but it's taken a huge amount of effort to support, and they've gradually extended it to more kinds of things. I mean, even though JavaScript has lexical scoping, it just has more features, and that makes it much harder to do.

01:20:44 [CH]

Yeah. The cases that I really care about are not the complicated ones; it's just single expressions with a couple of unit tests. And the thing I was gonna say too is, this helps with the issue of not knowing how to get the previous line. In RIDE it's control-shift-backspace; in BQNpad I don't know, it's control-something, but I don't need it, because I'm always just modifying in place. But in the J playground, [22] I don't know; I always just have to hit the up arrow, modify it, and hit enter. I don't know how to get the previous line to show up on my new line. And for the longest time in RIDE I didn't either; it's a three-key shortcut, and I was always just doing the old thing until one day I saw some YouTube video by Adám and he spelled it out, because you don't spell it out every time. I know Bob is furiously typing to figure out what it is.

01:21:38 [BT]

No, I'm giving away all my tricks here.

01:21:42 [CH]

The thing is, you know, I love these languages and I can never remember. But the beautiful thing with BQNpad, I mean, it's only one of many beautiful things, is that I don't have to remember; I just start typing and keep going. Anyways, we're holding Henry hostage here, doing an entirely different episode.

01:21:58 [BT]

It's control shift up arrow.

01:22:01 [CH]

There you go, so another three-key combination: control-shift-up-arrow. Good to know. Anyways, we'll wrap up this lightning talk at the end of our episode with Henry. Thanks so much for coming on, Henry, and telling us all about this. And yeah, listener, check the show notes, or just go to Jsoftware, and you can find the links to download this and to look at the release notes for anything that we may not have mentioned. OK, Bob, people can reach us at...

01:22:27 [BT]

contact@arraycast.com [23] and we actually did get a reply from Daniel; we mentioned him in the last episode. He was very touched by the mention.

01:22:39 [CH]

Uh oh, we're in a recursive infinite loop here now, yeah.

01:22:41 [BT]

Yeah. But he said: don't worry about giving it away. There are several Daniels involved with Clojure development, so he's not too worried about you giving away his identity there.

01:22:54 [CH]

And we didn't get to a topic we had on the back burner: J's folds, [24] which are the capital F suffixed by some combination of dots and colons. We will have Henry back in the future, maybe not next episode, but in the next few episodes, because it's something I was looking at, and I think the listeners would be interested to hear that conversation. So if you're willing, Henry, we will have you back and hold you hostage once again?

01:23:20 [HR]

Would love to.

01:23:22 [CH]

All right. I guess with that we'll say happy array programming.

01:23:24 [ALL]

Happy array programming!