• Safe and secure

  • Quick and easy

  • Web-based solution

  • 24/7 Customer Service


To Fill In Tr 13a, Follow the Steps Below:

Creating your Tr 13a online is easy and straightforward with CocoSign. Simply get the form here, then enter the details in the fillable fields. Follow the instructions below to complete the document.

Fill out the customizable sections

Customize the form using our tool

Fax the completed form

  1. Look into the right document that you need.
  2. Press the "Get Form" icon to get your file.
  3. Check the whole form to know what you need to key in.
  4. Enter the information in the free-to-edit parts.
  5. Double-check the important information to make sure it is correct.
  6. Click on the Sign Tool to design your own online signature.
  7. Drag your signature to the end of the form and press the "Done" button.
  8. Now your form is ready to print, download, and share.
  9. If you have any doubts regarding this, don't hesitate to contact our support team.

With the help of CocoSign's eSignature solution, you can get your document edited, signed, and downloaded right away. All you have to do is follow the process above.

Thousands of companies love CocoSign

Create this form in 5 minutes or less
Fill & Sign the Form

Hand-in-Hand Teaching Guide to key in Tr 13a


Tr 13a Demand Assistance

Shall we get started? Cool. Yes, we have fewer people today; normally we have more. Is there a break or something? I don't know; maybe they have a break in their minds. Okay, so did you all enjoy the lectures last week on memory? You got exposed to really cutting-edge research in memory, basically memory latency and errors, and you've also seen some examples of accelerating important workloads like machine learning and sparse matrix calculations. If you didn't attend, please review the lectures; there's an exam coming up next week, in case you didn't know. Last week's lectures are also part of the exam, as is anything we've covered, including today, and there will be a review session tomorrow, so please come to the review session. If you solve the questions beforehand, you will learn a lot more from the review session, and it will benefit the homework questions as well, of course.

So today we're going to cover memory controllers, which is what we've been talking about, but we never really went into the scheduling aspects of memory controllers; that's what we're going to do today. Sound good? Okay, let's jump into it. Your third lab is actually out already, which is on memory scheduling; hopefully you'll have some fun with it. This time you're going to move to a different infrastructure: the Ramulator infrastructure, the DRAM simulator that we talked about when we discussed simulation, which we developed and which is being used across academia and industry. So you'll get exposed to a state-of-the-art simulator, as opposed to building your own simulator as you did in labs 1 and 2. Now you're going to take an existing simulator and extend it with two known memory scheduling policies, which we're hopefully going to get to, and you'll also have the freedom to create your own memory scheduling policy, which will hopefully outperform the previous ones.
So you'll have the flexibility to create, and this is a very good area to innovate in, as you will see in a little bit. Okay, let's jump into memory controllers. Well, you know what these are: they are hardware structures that control memory, in the end. Even though we're going to talk a lot about DRAM, other types of memories also need memory controllers; any memory needs a controller in the end, and especially long-latency memories have similar characteristics that need to be controlled. The following discussion, in this lecture and maybe the next one also, will use DRAM as an example, but many scheduling and control issues are really similar in other types of memories, for example flash memory; if we have time, we will talk about flash memory later in the lectures. There are also other emerging memory technologies, like phase-change memory and STT-MRAM; we will definitely have a lecture on emerging memory technologies after the exam week, and you will see that they also require similar scheduling mechanisms. In addition, these memories are actually more complex, because they place other demands on the controller. It turns out errors are a big problem in all of these memory technologies, even bigger than in DRAM, even though you know that DRAM has a lot of errors today, like RowHammer. These memories have a lot more errors, and a lot of different failure mechanisms. For example, flash memory has a wear-out problem: if you keep writing to a cell, after some number of writes you cannot write to or read from it; the cell basically becomes non-operational. That's called wear-out. The memory controller manages the cells to ensure that they wear out evenly, so that no part of memory wears out earlier than the others; that's called wear leveling, and the memory controller is responsible for managing it.
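The wear-leveling idea just described can be sketched in a few lines. This is a toy model, not a real controller algorithm: the block count, the erase limit, and the least-worn-first remapping policy are all illustrative assumptions.

```python
# Toy wear-leveling sketch: spread writes across physical blocks so no
# single block wears out much earlier than the others. All numbers are
# illustrative; real flash endures thousands of program/erase cycles.

ERASE_LIMIT = 5  # hypothetical endurance limit per block

class WearLevelingController:
    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks   # wear per physical block
        self.mapping = {}                      # logical block -> physical block

    def write(self, logical_block):
        """Direct each write to the least-worn physical block."""
        victim = min(range(len(self.erase_counts)),
                     key=lambda b: self.erase_counts[b])
        if self.erase_counts[victim] >= ERASE_LIMIT:
            raise RuntimeError("all blocks worn out")
        self.erase_counts[victim] += 1
        self.mapping[logical_block] = victim
        return victim

ctl = WearLevelingController(num_blocks=4)
# Hammer the same logical block: the writes spread across all physical blocks.
placements = [ctl.write(0) for _ in range(8)]
print(sorted(ctl.erase_counts))  # [2, 2, 2, 2]: wear stays even
```

Without the remapping, all eight writes would hit one physical block and exhaust it; with it, the wear is balanced, which is exactly the controller's job in the lecture's description.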
You could also imagine the operating system managing that, if you think about it, but how you would do that is a good question. Wear-out exists in phase-change memory also; phase-change memory, as we discussed, is already out there: Intel has their persistent memory technology, and memory controllers do wear leveling in phase-change memory as well.

Before we move into DRAM, I'd like to very quickly give you an overview of more complicated memory controllers, if you will. This is an example of an SSD controller. These are in a sense very similar to memory controllers, except they're much more complex: they have to deal with a lot more, and they're clearly specific to another type of memory, flash memory, which basically has a lot more needs. These controllers, for example, have some parts that are general-purpose processors, and some parts that are really hardware and firmware, because they need to make decisions very quickly. For example, a lot of the ECC engines are implemented in hardware there; they need to do very complex error correction, so they may need to correct, say, 40 bits across 8,000 bits, because there are a lot of errors in these memories. They do wear leveling, as we discussed. They do voltage optimization, because whenever you do a read you need a reference voltage, and how you select that reference voltage significantly determines your error rate; if you pick a wrong reference voltage, you may actually read the data incorrectly, as we will see later on. This also leads to other things like page remapping, which also enables wear leveling. And because of the characteristics of the memory, some things are very different. For example, in DRAM, reads and writes are symmetric; we have never really thought about reads and writes differently. But flash memory is different.
In flash memory, if you want to write to a page, you first need to erase it, and that erase operation takes a very long time, at least an order of magnitude longer than a read operation. That means erases need to be carefully scheduled: you don't want to erase a page at the moment you want to write to it; you really want to have a pool of erased pages that you can write to at that point in time. That leads to issues like garbage collection: when do you decide that a page should be erased, when do you invalidate the pages, and then you need to remap the pages. We will talk about that later on; I just wanted to give you an overall idea, but if you're interested, I'll recommend some papers.

Okay, this slide doesn't actually show it nicely; I'll go to the next picture. If you look at a flash memory controller, there's DRAM in it. Why? Because the flash memory chips are comparatively very slow, so the controller incorporates some DRAM, which means it also has to have a DRAM controller inside. It incorporates DRAM to manage the data internally, so the DRAM can serve as a cache for the flash memory, and it can also serve as a write buffer. Remember that whenever you write to a flash memory cell, you're degrading its reliability; that's the wear-out problem. So why not write to DRAM first instead of to the flash chip, and consolidate many writes to the same page? That means you're going to reduce the number of writes that you do to the flash chips, which means that you'll increase the lifetime of the flash chips. So you're basically using the DRAM over here as a write buffer to maximize the lifetime of the SSD, which is what you're really controlling in the end. As you can see, this is a very complicated controller that contains not only a flash controller but also a DRAM controller.
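The write-buffering idea can be illustrated with a minimal sketch. Everything here is hypothetical (the page granularity, the trace, the single flush at the end); the point is just that coalescing repeated writes in DRAM cuts the number of writes that reach flash.

```python
# Toy sketch of write coalescing in a DRAM write buffer in front of flash:
# absorb repeated writes to the same page in DRAM, and write each dirty
# page to flash only once when flushing, extending flash lifetime.

class WriteBuffer:
    def __init__(self):
        self.dirty = {}           # page -> latest data, coalesced in DRAM
        self.flash_writes = 0     # how many writes actually reach flash

    def write(self, page, data):
        self.dirty[page] = data   # overwrite in DRAM; no flash write yet

    def flush(self):
        # One flash write per dirty page, no matter how many times the
        # page was updated while it sat in the buffer.
        self.flash_writes += len(self.dirty)
        self.dirty.clear()

buf = WriteBuffer()
for i in range(100):
    buf.write(i % 4, f"v{i}")    # 100 writes, but only 4 distinct pages
buf.flush()
print(buf.flash_writes)          # 4 flash writes instead of 100
```

A real SSD controller flushes continuously under capacity and durability constraints, but the lifetime argument is the same: fewer writes reach the flash cells.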
And there are a lot of other mechanisms employed today. Clearly, we talked about the ECC engine; there are scramblers that try to maximize reliability; there's also compression employed in some SSDs, and encryption as well. This paper, if you're really interested, gives a very good overview of that, and we will hopefully have a separate lecture on SSDs and cover these more. But this gives you an idea that DRAM exists everywhere; DRAM is so ubiquitous today.

Okay, so if you're interested, I'd recommend this paper that we wrote. It's a couple of years old now, but there's an updated version that I'm going to reference also. It gives you a very good overview of what an SSD controller looks like, and then it focuses especially on the error mechanisms. We talked a lot about memory errors; flash errors are even more numerous, and this is the result of eight years of research that we've done in the area of flash and SSDs. It incorporates a lot of the techniques that are employed in real flash controllers. Let me give you some examples. As you can see, there's a matrix over here: these are the different error types, and these are the different mitigation mechanisms employed in existing controllers. You can see P/E cycling: this is really wear-out; whenever you're programming, or writing data, you actually induce errors. Cell-to-cell interference: whenever you're doing something to a cell, you're disturbing some other cells. Data retention is a problem, and read disturb, similar to RowHammer, is a problem also. So these are different types of error mechanisms, and you can see there are a bunch of mitigation mechanisms, which we're not going to talk through right now: voltage optimization, that is, how you determine the read reference voltages, and refresh is clearly another one. Actually, flash memory controllers employ refresh, as we discussed earlier in lecture one or lecture two.
That's an important mitigation mechanism also. But you can see that not all mitigation mechanisms fix all of the errors; some of them are very specific, and some are very general. As you can see, hot data management, for example, can reduce the impact of many error types, because you try to move the hot data to a different location so that you don't wear out some parts as much; so it actually helps a lot of the error management mechanisms. If you're really interested, I recommend looking into this paper; at some point we may cover it in more detail. And this is the updated version, though there's an even newer update coming. The difference between this one and the previous one is that this one covers 3D-stacked flash memory; if you remember, we talked about 3D stacking. There's of course a lot of other material on SSDs which we're not going to talk about right now: simulation frameworks that you can develop, which are being used also, and things like scheduling mechanisms that are important for flash memory. Any questions?

I find SSDs fascinating. An SSD is pretty much a system by itself: you can do computation inside there, because you have processors inside there. Of course, since it's a system by itself, you can use the DRAM in there to minimize latencies as well. In a sense it's a hybrid memory, right? You really have DRAM plus flash memory, and you could also incorporate some emerging memory technologies in there as well. Yes? [Student question: can it remap worn-out cells?] It can, yes, absolutely. That's what's called over-provisioning of memory. All SSD drives over-provision memory: they have additional memory that they don't advertise, so when some cells wear out, they remap the worn-out cells to other locations in this spare memory.
That's a very common technique, and it's one of the reasons why you need page remapping in flash memory. Okay, any other questions? This is increasingly being employed in DRAM also; it's a very good technique in general, having these additional spare rows or spare columns that you can activate dynamically. Yes, there are mechanisms to detect wear-out; if you're interested, you can read the paper, but we'll talk about that when we talk about flash memory.

Okay, so let's move to DRAM now. As you already know, DRAM actually has many different types. You have a paper that you're going to read that talks about workload-DRAM interactions; that's part of your new homework. I don't know if it's released yet, but it will be released sometime. So DRAM has different types, with different interfaces, optimized for different purposes: commodity DRAM, the DDR types that we've seen; low-power DRAM (LPDDR); high-bandwidth DRAM, which we talked about, especially for graphics; low-latency DRAM types; embedded DRAM, which is embedded into logic but is very costly; reduced-latency DRAM (RLDRAM), which we talked about; 3D-stacked DRAM; and a bunch of other things. The key thing to take away is that the underlying microarchitecture is fundamentally the same in all of these DRAMs; what really differs is the interfaces, and the signaling mechanisms that you have at the circuit level. So a flexible memory controller would ideally support the various DRAM types. You could write a very long piece of code that supports all of them; it would be a lot of code, but you could do it. This clearly complicates the memory controller, and as a result most memory controllers don't do it, which makes it difficult to support all types of DRAM, and upgrades of the DRAM also. So you design your processor with a DDR3 memory controller, and five years later a new DRAM technology comes out, like GDDR5, say.
You cannot move to it, because your memory controller is not designed for that interface. First of all, it's not programmable; you don't have Verilog code for it, say. But more importantly, there are analog interfaces to the DRAM that you need to connect the processor pins and the DRAM chips with, and those interfaces are completely different in different technologies. Those interfaces are what make the difference, actually: that's what enables the very high bandwidth in GDDR5, for example. Remember, the underlying microarchitecture is mostly fundamentally the same; they differ in the number of banks, as we discussed, in row buffer sizes, things like that, and maybe in the number of ranks in some cases. But you get very high bandwidth because of the interface, and that analog interface is really the difficult part to design if you want a flexible memory controller. Let's assume you want to support five of these DRAM types: do you have five different types of analog interfaces? Those analog interfaces are actually a large part of your chip. It turns out they're also very, very expensive: they need to be designed for high speed; GDDR5, for example, operates at extremely high speeds, and even DDR4 today operates at close to three gigahertz. So designing that interface takes a long time. We're clearly not going to talk about the analog part in this lecture; this is a computer architecture class. But keep in mind that whenever you interface with an external memory, you also have an analog portion, and in this case what really limits you from supporting many different types of memories is that analog portion. You could imagine having many, many memory controllers written in Verilog; fine, you have all of them, but how do you actually interface with the different chips? That becomes a bottleneck.
That's why you don't see chips out there that support even two technologies; they normally support one. This one, I believe, supports LPDDR3, for example; this one also, I think, LPDDR3, but it could also be DDR3. So this is one complexity in the design of the memory controller: how do you support multiple different technologies? If you overcome the analog part, then it's really about figuring out how to design the different modules, and you've already seen this; this is just a picture from the Ramulator paper.

So let's take a look at the functions of the DRAM controller. Clearly there are many. The first key function is really to ensure correct operation: refresh and timing; we already talked about both. On top of this, you need to service the DRAM requests while obeying the timing constraints, and there are many: you have bank, bus, and channel resource conflicts that you need to keep track of, and you have minimum delays, read and write delays. We're going to see some of these delays; you've seen some of them in the last lectures when we talked about latency. But as we will see, there are more than a hundred of these different timing parameters, because of the way we design DRAM to minimize latency as much as possible. That's the best latency optimization we have in DRAM today: it's not huge, but it gives the memory controller some flexibility to optimize the latencies. The controller needs to translate requests into DRAM command sequences. It needs to buffer and schedule requests for high performance and quality of service, which requires reordering. It manages the row buffer, as we've discussed: how long do you keep the row buffer open, and when do you close it? You will see this aspect in your lab. Bank, rank, and bus management. And all of this is correctness and performance; then, on top of that, you have power, energy, and thermals.
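To make the timing-constraint idea concrete, here is a minimal sketch of a bank that refuses to issue a command before its timing constraints are met. The parameter names follow common DDR conventions (tRCD, tRP), but the values and the two-constraint model are simplifications for illustration, not real DDR timings; as the lecture notes, a real controller tracks over a hundred such parameters.

```python
# Toy sketch: a DRAM bank that checks two timing constraints before
# issuing a command. Values are placeholder controller cycles.

tRCD = 5   # ACTIVATE -> READ/WRITE minimum delay
tRP  = 5   # PRECHARGE -> ACTIVATE minimum delay

class Bank:
    def __init__(self):
        self.last_activate = None
        self.last_precharge = None

    def can_issue(self, cmd, now):
        if cmd == "READ":
            # A read needs an activated row, and tRCD must have elapsed.
            return (self.last_activate is not None
                    and now - self.last_activate >= tRCD)
        if cmd == "ACTIVATE":
            # An activate must wait tRP after the last precharge.
            return (self.last_precharge is None
                    or now - self.last_precharge >= tRP)
        return True

    def issue(self, cmd, now):
        assert self.can_issue(cmd, now), f"timing violation: {cmd} at {now}"
        if cmd == "ACTIVATE":
            self.last_activate = now
        elif cmd == "PRECHARGE":
            self.last_precharge = now

bank = Bank()
bank.issue("ACTIVATE", now=0)
print(bank.can_issue("READ", now=3))   # False: tRCD not yet satisfied
print(bank.can_issue("READ", now=5))   # True
```

The scheduler's job is to find, each cycle, a command for which `can_issue` holds, which is exactly the "first ready" part of the policies discussed later.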
How do you ensure that you don't use power unnecessarily? Because any power you use unnecessarily is power that you take away from computation. So we need to be careful over here too; we're not going to talk much about that part, but we're going to talk a lot about this one.

So, a modern DRAM controller: this is not the best picture, but it looks roughly like this. You get requests from different cores, and from I/O devices also, and then you have some sort of arbiter, and you decide which bank each request goes to, or which channel, if you have multiple channels. The requests go into request buffers, and then you do command scheduling, and then there's the analog part I was talking about: once you schedule a command, it needs to go through this analog interface to communicate with the DRAM, and the data needs to come back through that analog interface also. These are pins that you need to drive outside the chip, so they're huge, actually; that's one reason you cannot have many of them. It's not only that they're huge; it's also that this is a complicated circuit that needs to operate at very high speed, and hopefully at low power, so it increases the area quite a bit.

Okay, this is my simplified microarchitecture-level picture, from this paper that we'll hopefully talk about. It's similar: you get requests from the caches, and they get routed to request buffers at the banks, and each bank has some independence. This is one way of designing the memory controller: each bank has an independent scheduler, each bank scheduler selects a request that can be scheduled at a given point in time, and then there's another arbiter over here, the bus scheduler, which picks which bank's request goes out on the memory bus. Of course, in one cycle you can schedule only one request, because you have a single address and command bus that goes into the DRAM.
In the next cycle you can schedule another one, and in the next cycle another one. Okay, so we've seen some scheduling policies already. The simplest one is no scheduling, in a sense: FCFS, first-come first-serve. You don't need to do any scheduling; you just schedule the requests in the order they arrive. It turns out that's not very high performance, as we will see, and as a result people developed this policy, which we also discussed: FR-FCFS, first-ready first-come first-serve. This policy prioritizes row hits over other requests, and, all else being equal, it prioritizes older requests over other requests. You've seen this before; the goal is to maximize your row buffer hit rate, because row buffer hits have much shorter latency than row buffer misses or conflicts, as we've seen. This maximizes DRAM throughput, and it also minimizes latency, assuming these are the only things that affect latency; but as we will see, there are a bunch of other things. So this is commonly employed. Actually, scheduling is done at the command level, as you've seen last week: column commands, read and write, are prioritized over row commands, activate and precharge. Keep this in mind. There are many different memory controller designs out there, and many companies do different things. Some of them, for example, do some scheduling at the request level: they look at the requests, and after scheduling them, they break each request into commands, and then the commands are scheduled in first-come first-serve order. That's one way of doing it. At the request level, you just see "read to address X" or "write to address Y"; you don't have the commands at that point, so you don't know whether the request needs an activate, a precharge, or just a read or a write.
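The FR-FCFS prioritization can be sketched as a sort key over buffered commands, assuming a made-up command format of (kind, arrival time). A real scheduler would also filter for commands whose timing constraints are currently satisfied (the "first ready" part); this sketch shows only the prioritization order.

```python
# Sketch of FR-FCFS prioritization over DRAM commands:
# column commands (row hits: READ/WRITE) before row commands
# (ACTIVATE/PRECHARGE), and within each group, older before younger.

def frfcfs_key(cmd):
    kind, arrival_time = cmd
    is_row_command = kind in ("ACTIVATE", "PRECHARGE")
    # Tuples sort lexicographically: False (column command) sorts before
    # True (row command); ties broken by age.
    return (is_row_command, arrival_time)

queue = [("ACTIVATE", 1), ("READ", 3), ("READ", 2), ("PRECHARGE", 0)]
queue.sort(key=frfcfs_key)
print(queue[0])   # ('READ', 2): the oldest column command goes first
```

Note that plain FCFS would have issued the PRECHARGE (arrival time 0) first; FR-FCFS instead drains the row hits before touching row commands.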
So you do some scheduling at the request level, then translate requests into commands, and then you can do scheduling at the command level also. You can actually have a multi-level scheduler. I'm not going to go into the trade-offs, but you can imagine that they're different: command-level scheduling is finer-grained; you're not scheduling requests, you're scheduling pieces of requests, and each approach has advantages and disadvantages. For example, a lot of DRAM schedulers do request-level scheduling first and then first-come first-serve at the command level, although people are also complicating their schedulers going forward.

Okay, I think we've already discussed this: this is the command-level implementation of first-ready first-come first-serve. Column commands are prioritized over row commands, so you have two groups now, column commands and row commands, and within each group, older commands are prioritized over younger ones. That's how you implement first-ready first-come first-serve. I'm going to go through this quickly because I've shown you this before; it's just to jog your memory. We have this row buffer, and you know that row buffer hits are faster than row buffer conflicts. You've seen this animation multiple times before, so you can go through it on your own. Okay, good. Any questions while I go through this animation? We can overlap the latency of the animation with your questions; overlapping latency is a powerful thing.

Okay, so what's a scheduling policy? A scheduling policy is essentially a prioritization order, and this prioritization can be based on many, many things; here you can really exercise your creativity. It could be based on request age; clearly, we talked about that: first-come first-serve, older first. It could be based on row buffer hit/miss status. It could be based on request type: is it a read or is it a write? Is it a prefetch?
We're going to talk about prefetching later on; you will see that a prefetch tries to bring data in before it's really requested by the processor. How do you order prefetches compared to reads and writes? This is type at the request level: you don't have a prefetch command in the DRAM, but at the request level you clearly have prefetch requests. Do you prioritize them over reads or writes? It turns out there's no simple answer. Let's call the reads and writes demands; these are demand requests from the processor. If you always prioritize demands over prefetches, you don't get the best performance; if you always prioritize prefetches over demands, you don't get the best performance either. There are cases where it makes sense to prioritize prefetches: for example, a prefetch request arrives for a row buffer that's already open, and if you serve that prefetch, you can quickly get one more piece of data out of the row buffer without delaying the demand requests too much. If that prefetch request turns out to be useful, you've done a good thing. But of course, if that prefetch request is not useful (you don't know this beforehand, but you can predict it), then you may not want to prioritize it over demands. So the right answer here is really to be adaptive: you want to adapt to the accuracy of the prefetch requests, whether they're going to be useful, and also to the status of the DRAM itself. If the prefetch request would cause a row buffer conflict, you may want to be very careful about whether or not to prioritize it over a demand request. Does that make sense? If you're interested in this, there's work we did a long time ago called prefetch-aware memory controllers. There are a lot of things here.
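The adaptive treatment just described can be sketched as a tiny decision rule. The accuracy threshold and the inputs are hypothetical; a real controller would estimate prefetch accuracy dynamically and weigh more state than this.

```python
# Toy decision rule for the adaptive prefetch prioritization described
# above: let a prefetch compete with demand requests only when the
# prefetcher has been accurate AND the prefetch hits the open row.

ACCURACY_THRESHOLD = 0.7  # hypothetical cutoff, not from any real design

def prefetch_priority(prefetch_accuracy, hits_open_row):
    """Return True if the prefetch should be treated like a demand."""
    return prefetch_accuracy >= ACCURACY_THRESHOLD and hits_open_row

print(prefetch_priority(0.9, True))    # True: accurate and a row hit
print(prefetch_priority(0.9, False))   # False: would cause a row conflict
print(prefetch_priority(0.3, True))    # False: prefetcher is mostly wrong
```

This captures the two inputs the lecture names, prefetch accuracy and DRAM row buffer state, as the adaptation signals.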
Request type matters in another way too, again at the request level: is it a load miss or a store miss? A load miss is a load instruction that caused a cache miss and is out in memory, and a load instruction is clearly waiting for the data so that the processor can proceed. But a store miss, a store instruction, may not really be waiting for the data, especially if you have a write buffer; if you do, stores don't cause a lot of problems, because store misses can be handled off the critical path as much as possible. There's a similar case over here: is it a writeback from memory? If it's a writeback, then it's really not delaying the processor; correct, but of course not in all conditions. Criticality: now we're getting more fuzzy, right? How critical is this request to the progress of the processor, and how do you even define that? People have had many different definitions. For example, is it the oldest miss in the core? The core can generate, let's say, five misses: is this the oldest one, assuming that the oldest one is really blocking the core? Or, how many instructions in the core are dependent on this request? That could be another definition of criticality: if you finish this request early, you enable a hundred different instructions to execute, which is possible, because all of those instructions may be waiting for a single piece of data. So people have tried quite a bit to figure out definitions of criticality. It turns out many of the cache misses whose latencies are not overlapped are actually critical from the perspective of the core, but the key question is, if a miss's latency is overlapped, how do you figure that out? That's basically the overlapping example we called out for the processor. These require difficult prediction mechanisms, and people have proposed such prediction mechanisms.
Another aspect, which is not completely independent of the previous one, is: what interference is this request causing to other cores? That could be an input you take into account when you prioritize requests. If, for example, this request would cause a lot of interference to the other cores if you schedule it, maybe you don't want to schedule it; maybe you want the other cores to go first. That's the idea, and you will see this a lot in this lecture. In your lab you're actually going to implement two different schedulers that try to solve this problem: one tries to maximize performance, and the other tries to solve it for fairness, performance, and complexity at the same time, and they are simple schedulers. And there may be other things; does anybody have an idea what else you could put over here? People have imagined a lot (dot dot dot; I didn't put everything on the slide), but you could imagine other things too. Any thoughts? Okay, maybe you'll have more thoughts when you do the lab.

Okay, so let's talk about row buffer management a little bit. This is an important aspect; not the only one, clearly, but I want to cover it before we go into more scheduling policies. There are multiple different policies to manage the row buffer. One is called open row: you access a row, and the idea is to keep the row open, implicitly anticipating that some other request will access that row. If that's the case, then you got it right: you get a row hit, and you made the right prediction, if you will. But if the next access in that bank is to a different row, then you get a row conflict, and you not only lose performance for that request, but you also waste energy, because you kept the row open.
Keeping the row open actually consumes energy, because the row buffer is active, and the row buffer is connected to the cells; if you remember the DRAM operation, it's really consuming energy the whole time, and that is very, very important in mobile devices. Okay, so that's one policy. The other policy is closed row; I'll tell you the reasonable implementation of this policy. The idea is to close the row after an access if no other requests already in the request buffer need the same row. Of course, you could ignore this condition and always close the row after an access, but that's probably not a good idea if you already know that there's another request that wants this row. So normally, reasonable implementations close the row right after an access only if there are no other requests to the same row. This implicitly predicts that the next access goes to a different row, and it avoids a row conflict if that's the case. But if that's not the case, meaning the next access actually goes to the same row, you incur extra activate latency, and in fact potentially extra precharge latency also. So these are two different mechanisms, and once you see this, you may think: why don't we just predict? Adaptive policies do essentially that: they try to predict whether or not the next access to the bank will go to the same row, and act accordingly; if the prediction is that it's going to the same row, they keep it open, otherwise they close the row. These are implemented in existing memory schedulers, although there's not a lot of literature that really talks about it; you can see some patents related to it. It's fascinating, isn't it? This is just row buffer management, and it's important because it really impacts your energy significantly.
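Here is a toy comparison of the two policies on short access traces. The latencies are placeholders, not real DRAM timings, and the model charges the closed-bank cost to the very first open-row access too; it just shows why each policy wins on a different access pattern.

```python
# Toy comparison of open-row vs. closed-row buffer management.
# Latencies are made-up cycle counts, not real DRAM parameters.

HIT, CONFLICT, CLOSED = 10, 30, 20

def open_row_latency(trace):
    total, open_row = 0, None
    for row in trace:
        # Row hit if the open row matches; otherwise a row conflict
        # (precharge + activate + read).
        total += HIT if row == open_row else CONFLICT
        open_row = row             # keep the row open after the access
    return total

def closed_row_latency(trace):
    # Close the row after every access: each access pays an activate,
    # but never a conflict, since the bank is always precharged.
    return CLOSED * len(trace)

streaming  = [0, 0, 0, 0]          # same row repeatedly
random_ish = [0, 1, 2, 3]          # a different row each time
print(open_row_latency(streaming),  closed_row_latency(streaming))   # 60 80
print(open_row_latency(random_ish), closed_row_latency(random_ish))  # 120 80
```

Open row wins on the streaming trace (row hits), closed row wins on the row-conflict-heavy trace, which is exactly why the adaptive, prediction-based policies described above exist.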
In many memory controllers, what designers do is use a timeout: if the row buffer has been open for too long, they close it so that they save energy. Of course, a prediction-driven mechanism is much better. This slide is just a summary of the policies I described; I'm not going to go through it in detail. For each policy it gives the first access, the next access, and the commands needed for that next access. For example, with a closed-row policy, if the first access is to row 0 and the next access is to row 1, you just activate row 1 and read; you don't need a precharge for the next access. You can compare the two rows of the table that way.

Before we go into more scheduling, I'll talk about power management a little, because this is becoming increasingly important. Existing DRAM chips have power modes. I don't think they're enough, and I think this needs to change going forward, but the idea is very simple: when you don't access the DRAM chip, power it down. There are multiple power states. "Active" is the highest-power state. Actually, the highest power is really consumed when you're accessing the chip, doing a read or write, but that's an access, not a power mode in this sense. When you're not accessing the chip but a row buffer is active, you're still consuming power, you still need to supply power, and that's the highest-power state here. The next state in this example set is "all banks idle", which means the row buffer is not active in any of the banks. "Power-down" is an even more aggressive mode where you cut power to much of the chip, so it takes more time to power the chip back up before you can access it. (I don't know where this sound is coming from; let me see, maybe I'll reduce it. Let's try this.) And then the lowest power state is self-refresh, which is what this phone is doing right now: its memory is in the self-refresh state at this point. It's really not consuming any power other than the refresh it needs to do to keep the memory contents alive, because it may receive a call, or someone may interact with it, and you don't want to have to put the memory contents on the SSD; you want to minimize the latency of data access. (It could also be in hibernate mode, in essence, but it's not doing that.) There's also some leakage power associated with the refresh circuitry.

Of course, these power states come with trade-offs: state transitions incur latency during which the chip cannot be accessed. For example, this phone is in self-refresh right now; I take it and press a button, it goes out of self-refresh mode into memory-access mode, and that takes time, on the order of microseconds, because it needs to power up a bunch of circuitry inside. So the question, if you're a memory controller trying to manage power, is: when do you go into self-refresh mode? Should you go into it right away after you're done, or should you wait for some time?

[Student question.] Yes, those are very good points; we touched on them in the refresh part. Existing controllers actually don't do that much, but going forward we'll need to do a lot more; we need to be more aggressive in power management. Today, for example, there's no mechanism to put individual banks into self-refresh separately, and there's no mechanism to turn off refresh on a per-bank basis either. Exactly, and you can see what a waste that is in the end. Those ideas have been proposed: for example, if you know that part of your memory is not allocated, you're still refreshing it today, because we don't have the interfaces to carry that information all the way into the memory controller. I think these are simple cases where you can gain a lot in real systems. That's a very good point. (I don't know why this is making sounds. Jeremy, do you know?)

Any other questions? Yes. [Student question.] Let me think about it; it really depends on how you lay out your data in memory. When you access a cache line, you're getting it out of a single bank. If the next cache line is in the next bank, then I think you're right; but if a lot of your data is placed in just one or two banks, then you can really turn off the other banks. That's a really good point: there's an interaction between your data-mapping policy and which banks you're accessing. (Am I doing this, or is it happening some other way? Okay, let's try it this way. Maybe this is a faulty one; this never happened in previous years. We're going to switch to this one: fewer wires and less interference, let's hope it works.) So basically, if you really want to take advantage of even lower power, you need to map your data nicely as well.

Okay, any questions? Yes, one more. [Student question.] Absolutely, yes; it's a trade-off in the end. If you map your data for low power, maybe you lose some bank-level parallelism, because you're keeping your data in only one bank. Exactly, so it really depends on what your access patterns are. This goes back to my example from Zurich Airport, if you remember: if your data is mapped into only one channel, you're not utilizing the other channel. You could keep it idle, you could power it down somehow (although that's not done at Zurich Airport, clearly; it's really wasted). Any other thoughts? You're hitting some fundamental points; these are really things that need to be handled better.
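The timeout-based power management just described can be sketched as a simple state ladder: after the chip sits idle long enough, step down through progressively deeper (and slower-to-exit) low-power states. The thresholds and exit latencies below are made-up illustrative numbers, not values from any datasheet.

```python
# Toy timeout-driven DRAM power management: the longer the chip has
# been idle, the deeper the low-power state it is placed in, at the
# cost of a longer wake-up latency on the next access.

POWER_STATES = [
    # (state name, idle cycles before entering, exit latency in cycles)
    ("ACTIVE",        0,      0),
    ("PRECHARGE_PD",  100,    10),     # power-down: cheap to exit
    ("SELF_REFRESH",  10_000, 1_000),  # deepest state: slowest to exit
]

def power_state(idle_cycles):
    """Pick the deepest state whose idle threshold has been reached."""
    state = POWER_STATES[0]
    for s in POWER_STATES[1:]:
        if idle_cycles >= s[1]:
            state = s
    return state[0]

def access_latency_penalty(idle_cycles):
    """Extra cycles the next access pays to wake the chip back up."""
    for name, threshold, exit_lat in reversed(POWER_STATES):
        if idle_cycles >= threshold:
            return exit_lat
    return 0
```

This makes the trade-off in the lecture concrete: entering self-refresh eagerly saves the most power, but every wake-up then pays the large exit latency, which is why a prediction-driven policy beats a fixed timeout.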
There's not enough research in these areas, I think. Okay, so let's talk about the difficulty of DRAM control a little more before we go into some human-designed scheduling policies. Basic DRAM control is difficult to design; these are actually really difficult components to design. Have you heard about self-optimizing memory controllers before? Okay, good, so we're going to discuss them. I talked about this briefly in my earlier lectures, but I don't think anybody here has taken those. Who was in Digital Circuits before? No one here; that's good, so let me go through this, and then I'll give you the story.

There are many DRAM timing constraints. Actually, fifty is an understatement; there are more than a hundred DRAM timing constraints in the end. For example, there's a constraint that specifies the write-to-read latency: the minimum number of cycles to wait before issuing a read command after a write command. This is at the rank level. Assuming one channel is one rank, you can only do a read or a write to the rank; you cannot do reads and writes simultaneously. That's what this says: if you want to switch from write mode to read mode, you have to wait for some time, and the same is true for switching from read mode to write mode. This introduces a very not-so-nice constraint: you have to treat reads and writes differently, fundamentally. Why is this constraint there? Because whenever you're driving the memory bus, you drive it only one way. You could potentially make it driven both ways, but that becomes more complicated electrically. So it's one way, and you need to turn the bus around if you want to drive it the other way: when you're writing, you drive data into the DRAM chip; when you're reading, you drive data out of it. That turnaround takes what's called the bus turnaround time, this timing constraint accounts for it, and it can be on the order of nanoseconds, as we will see.

So what do existing memory controllers do? They batch the write requests. They accumulate write requests in a write buffer and wait before servicing them, and when that write buffer is almost full, or above a high watermark, they say: okay, it's time to switch to write mode. While they're in write mode, you cannot serve reads from that channel. That sounds bad, right? Reads are critical for performance; writes are not, unless they get in the way of reads, which in this case they may be doing. So the question is: when do you switch between write mode and read mode? It turns out this has a huge effect on performance; I'll give you some references to papers that do it carefully, and this is something we have not even considered so far. And there are other constraints. tRC you've already seen: the minimum number of cycles between issuing two consecutive activate commands to the same bank. If DRAM is thought of in terms of latencies, this really is the DRAM latency: activate-to-activate latency to the same bank. There are a bunch of other ones; I'll show you some more.

Beyond timing constraints, you need to keep track of many resources to prevent conflicts: channels, banks, ranks, the data bus, the address bus, row buffers. Clearly, you should ensure that no two banks drive data onto the data bus at the same time; that should not happen. You handle all the DRAM refresh on top of this. Today it's simple, but not as simple as it looks: even DRAM refresh has some flexibility. Even though every row needs to be refreshed every 64 milliseconds, or today every 32 milliseconds, you have some flexibility in when you schedule those refreshes: you can pull some refreshes in early and push some out late, and that gives you flexibility in handling the scheduling of requests. We won't go into a lot of detail; I'll point to some papers, but this matters for correctness as well as performance. You need to manage power consumption, and you need to optimize for performance and quality of service in the presence of all these constraints. And it turns out reordering is not simple either, especially with very large buffer sizes; and if on top of this you start thinking about fairness and quality of service for different cores, different agents accessing memory, that complicates the scheduling problem further. So there are many, many things to think about.

Here are some examples. This is the paper that talks about write-to-read scheduling; it's one of the first papers in the area that goes into a lot of detail. Look at the write-to-read latency, for example: that's a lot of cycles that are wasted. These are some of the constraints, and if you're really interested, these two papers, some of which you've read, cover them. So why do we have all these timing constraints? It looks like a mess, right? You've already seen some of these timing constraints in the last lecture, which is why I'm not going into a lot of detail, but let's look at one of them. This is a bank in the precharged state: you activate it using an activate command, and before you read or write, you need to wait for some time, so that the data gets ready to be read in the sense amplifiers. So there are real physical reasons for these timing constraints, and people have specified different timing constraints for different physical operation phenomena in DRAM. This is tRCD, the activate latency: this activate-to-read/write constraint's scope is the bank level, and its value is 13 to 15 nanoseconds in DDR3, for example. Let's pick another one: read-to-write. It dictates the timing distance from a read command to a write command, it's enforced at the rank level, and its value is 11 to 15 nanoseconds. There's a reason for it, too: you need to turn the bus around from read to write.

So clearly we have many, many of these, and it sounds like a mess: people have specified individual timing constraints for every single combination of first command and next command. Why did they do that? Why didn't they say: okay, I access the DRAM, there's a single timing constraint of X cycles, I wait for X cycles, and then I can do anything I want to the DRAM afterwards? Exactly: if you did that, X would need to be the worst case over all combinations of first command and next command, which means X would be very large, which means you cannot optimize the latency as much. You can see the difference over here: this one is 50, for example, and this one is 7.5. If it's activate-to-activate, you wait 50; if it's read-to-precharge, you wait only 7.5. Per-pair constraints reduce the latency compared to a single worst-case constraint, and that's why we have all of them. But of course there's another question you can ask: why do I have to deal with all of this mess? Why don't I just send the command to the DRAM and wait for an acknowledgement? [Student answer.] You wait for an acknowledgement, yes; there's a handshake protocol you'd need to go through. You're right, although I'm not sure the answer is as simple as that. I think we need to really rethink this.
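The per-pair bookkeeping described above can be sketched as a small timing table plus a record of when each command last issued. The numbers are roughly in the DDR3 ballpark mentioned in the lecture, but they are illustrative, not a faithful copy of any datasheet, and the class name is my own.

```python
# Minimal sketch of how a controller can track per-command-pair timing
# constraints instead of one worst-case delay.

# (previous command, next command) -> minimum separation in nanoseconds
TIMING = {
    ("ACTIVATE", "READ"):      13.75,  # tRCD: activate-to-read
    ("ACTIVATE", "ACTIVATE"):  48.75,  # tRC: activate-to-activate, same bank
    ("READ",     "PRECHARGE"):  7.5,   # read-to-precharge
    ("READ",     "WRITE"):     12.5,   # read-to-write bus turnaround
    ("WRITE",    "READ"):      15.0,   # write-to-read bus turnaround
}

class BankTimer:
    def __init__(self):
        self.last_issue = {}  # command -> time it was last issued

    def earliest_issue(self, cmd, now):
        """Earliest time `cmd` may legally issue, given past commands."""
        t = now
        for (prev, nxt), gap in TIMING.items():
            if nxt == cmd and prev in self.last_issue:
                t = max(t, self.last_issue[prev] + gap)
        return t

    def issue(self, cmd, now):
        t = self.earliest_issue(cmd, now)
        self.last_issue[cmd] = t
        return t
```

With a single worst-case constraint, every command would wait the full 48.75 ns; with per-pair constraints, a READ can follow its ACTIVATE after only 13.75 ns, which is exactly the latency argument made above.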
So basically, this is a very synchronous way of communicating; that's why existing DRAMs are synchronous in the end, they're called synchronous DRAM. The memory controller knows exactly, based on a specification, how long it needs to wait between any two things it can do. Everything is very well synchronized; it's an interface dictated by the datasheets, and the memory controller is designed for it, taking all these parameters into account. But there's a completely different way of designing memories and memory controllers, which is asynchronous. The idea is that you get rid of all of this: the interface is very simple, the memory controller sends a high-level command, a read or write request, and the memory responds when it's ready. The memory controller has no idea how long something will take; it takes as long as it takes, and the controller assumes the command is done when the memory responds. That's the idea. Yes? [Student question.] Exactly: maybe there's an internal memory controller on the DRAM side that deals with that. Very good, now you're asking the right questions. Do you put it inside the DRAM, or in a logic layer very close to the DRAM? Let me talk about that a little, because there's some evolution here which is really interesting. The memory controller has been the domain of something outside the DRAM for a long time; it's logic, and that logic does all of this scheduling. With synchronous DRAM, since the controller knows all of these latencies, it can do the reordering very nicely, and the memory is something kind of dumb. But if the memory is not fully dumb, if it has some logic inside it or underneath it, then maybe the controller on the processor side can be much simpler. So your point is completely correct: somebody needs to do this reordering if you want to maximize performance; the question is who should do it. Yes? [Student comment.] I think you two have similar points: someone can schedule smarter because they know all of this, but the question is who should know all of it.

Okay, keep these things in mind. I think asynchronous DRAM may be a good idea going forward, but you're potentially pushing the complexity somewhere else. If you make the interface asynchronous, and that complexity can be implemented inside a DRAM chip or inside a DRAM stack, maybe that's the right thing to do, because that interface could also be good for processing inside memory: your interface no longer needs to be command-based, it can be packet-based. You send a packet to memory, the memory decodes the packet and figures out: oh, I have a function to execute, and that function may really consist of multiple commands as well. Okay, yes? [Student question.] It depends on who has more information, I think. Today, because you cannot communicate a lot of information across the interface, you may not get a lot of benefit from this. And I'm not saying the current design is completely bad; it's something to be rethought. This interface has been assumed for a long time, but it has not always been like this: in the past it was asynchronous. Today it's time to potentially rethink it, because the complexity is getting out of hand, and we have new technologies where you can place memory controllers on the other side, close to the DRAM. Yes, I'll take one more. [Student question.] It could be exactly the same mechanism: today somebody guarantees that these operations complete; the memory could also have internal timings, and it could do all of the optimizations we talked about, in terms of adaptive latency, internally, as opposed to having something outside the memory trying to figure that out. Exactly, yeah.

Okay, so now you know why we have all these different types of timing constraints: they enable you to optimize performance. But keep in mind that this may not be the best way of designing things going forward, and there are papers that talk about the reasons; you've seen this also. Keep in mind, too, that memory-controller design is becoming more difficult. A system today is not just CPUs; it's also GPUs, hardware accelerators, I/O devices. This phone has lots of accelerators in it, machine-learning accelerators, and they all go through the memory controller. Memory is also becoming complicated: we have hybrid memories. So you have heterogeneous agents and heterogeneous memories, and you need to design a memory controller, or multiple memory controllers, that reduce the interference between these agents and satisfy the constraints of the different memories, with many goals at the same time: performance, fairness, quality of service, energy efficiency, dot dot dot. We're going to focus on just performance for the rest of this part of the lecture. Even performance by itself is not easy, assuming you want the highest performance while obeying all the constraints, because you need to obey the constraints to make sure things work.

I'll give you the story I wanted to give you earlier. Does anybody know the name Chuck Thacker? No one; no one old enough. Has anybody heard about the Xerox Alto system? These are among the earliest personal computers, from the early 1970s. Chuck Thacker was a key designer and system architect of the Xerox Alto, an early PC, and he won the Turing Award in 2009 for the system design work he did; he's an extremely good designer. I got to know him while I was at Microsoft Research, because he was also at Microsoft Research, and at the time I
was working on memory controllers at the architecture level. He was designing memory controllers for the FPGA engines he was very interested in at the time, and these were simply DDR3 memory controllers; I would call them simpler than some other memory controllers. And this designer, who built early personal computers and did a lot of system-level work, said that this memory controller was the worst thing he had designed in his life. He took a lot of pains to design it, because of everything we discussed; he thought it was a mess. Think about that coming from, essentially, a Turing Award winner who designed many early systems, and those early systems were not all simple: to make a personal computer work you needed, for example, the earliest virtual-memory mechanisms. Those are not simple things, but apparently, according to him, they were not as messy as memory controllers. That gives you a perspective on what a memory controller is like, and I agree: I think a memory controller is a terrible thing to have to design today. As a result, we end up with very simple policies, and because of that, we were thinking about the following idea at the same time.

So here is the reality, and this is obvious, I think; if you ever design a memory controller it will hit you right away. It is difficult to design a policy that maximizes performance, even ignoring quality of service and energy efficiency. Let's stop at performance; if you put the other goals on top, it becomes even worse. There are too many things to think about; I've given you a bunch of them, and I think it is not humanly possible to design the best policy, especially if you want to adapt to continuously changing workloads and system behavior. Imagine you're designing this memory controller: it's going to execute workload X now, workload Y later, and sometimes a combination. I hope you can design one policy that fits all of that, right? So what you end up with is very simple policies. This memory controller here has been doing first-come first-served, or a variant of it, since the beginning of its life, and it doesn't learn anything; it's basically dumb. I don't know how long it's been operating, maybe five years, I've had it for about five years, and it's still running the same policy. So the question is: does it really make sense to design a system like this today, in 2019? Keep that in mind.

So that was our dream: wouldn't it be nice if the memory controller found a good scheduling policy on its own, as opposed to a human dictating the policy? I don't think dictating the best policy is really humanly possible; this is a place where we probably need to recognize human limits. Basically, we want to look at the performance function of the memory controller: you have a memory controller, it resolves memory contention by scheduling requests, and the key question is how to schedule those requests to maximize system performance. This was our idea; we called it the self-optimizing DRAM controller, and we did this work around 2007. I've already given you the problem: memory controllers are difficult to design, and it is difficult for human designers to come up with a policy that adapts well to many workloads and system conditions. You don't even need to think about different workloads: even within the same workload, the controller uses the exact same policy over time. Our idea was a memory controller that adapts its scheduling policy to workload behavior and system conditions using machine learning, and the observation we had, I think, was nice: we saw that
reinforcement learning maps very nicely to memory control. Reinforcement learning is a very basic learning mechanism; actually, all of us are reinforcement-learning agents right now, and we'll talk about that. Our design observes the memory controller as a reinforcement-learning agent. What does this agent do? It dynamically and continuously learns and employs the best scheduling policy to maximize long-term performance. Of course, there's a lot that needs to go into it, which we'll talk about briefly; I won't go into detail, but I will have you read the paper for your next homework.

So what is a reinforcement-learning agent? As I said, all of us, all living beings, are reinforcement-learning agents. You're an agent that interacts with the environment: you observe some state in the environment, you take some action, and you get some reward. Over time, assuming you want to maximize that reward, you try to choose the actions that maximize the reward in a given state. It's very simple, and a lot of the fundamentals of behavioral psychology are based on it. If you know the name B. F. Skinner: he's really the father of reinforcement learning in psychology, of behaviorism. He showed that you could train mice. You put a mouse in a box with a lever, and the mouse somehow randomly figures out that when it presses the lever, it gets food, so it gets reinforced: it takes an action in a given state, it gets a reward, it makes the connection between pressing the lever and getting the reward, and it keeps doing that to maximize its reward. That's positive reinforcement. There's also negative reinforcement: you don't get a reward, you get a punishment. If the mouse wanders around, for example, maybe it gets an electric shock. This box is actually called the Skinner box, if you're interested in looking it up. So that's reinforcement learning; people and animals operate on this principle. If you put your hand on a hot stove, you get negative reinforcement: a punishment, a negative reward.

Memory control actually maps nicely to this, and you can model it as a Markov decision process. In terms of the theory, people have shown that reinforcement learning works best if the state space of the environment can be expressed as a Markov decision process: you're in some state, and if you take some action, with some probability you go to some other state. You can prove a lot about reinforcement learning if your system obeys the Markov-decision-process property. I won't go into the details; that's the more theoretical side of machine learning, and the paper points you to enough references. It turns out that memory scheduling, if you constrain the problem to performance, and specifically to data-bus utilization, really is a Markov decision process. The scheduler observes some state, takes an action (schedules a command), and that action leads to some reward, where the reward can be expressed in terms of data-bus utilization: is the bus utilized or not? Over time, the scheduler records this information and figures out which action, in a given state, leads to the maximum data-bus utilization. Of course, the goal is not to immediately maximize data-bus utilization; the goal is to maximize long-term data-bus utilization, so the reward function is really important: how you specify it, and how you update the rewards you get. I won't go into detail, but basically you get some reward at time zero, at time one, at time two, at time three, dot dot dot, and you weight them with some weighting function; this is one simple weighting function. So you learn to maximize the longer-term reward for a given state-action pair. For this, you need tables that record the correlation between state-action pairs and the learned reward values. Then, the next time you're in a similar state, you look at your choice of actions, each leading to a different learned reward value, and you pick the action that maximizes your reward in the long term. (I say "similar" state because you want to generalize, not be extremely specific; generalization is very important for learning in general. You don't want to require exactly the same conditions before taking the action that maximizes your reward; you want to generalize to similar conditions, and how to do that is a very interesting question in reinforcement learning.) That's the idea, assuming you've learned all of this in your tables.

So that's the idea, and the paper has more detail; I'll refer you to it. You associate system states and actions with long-term reward values; each action at a given state has a learned estimated reward; you schedule the command with the highest estimated long-term reward value in each state; and you continuously update the reward values for the state-action pairs based on feedback from the system: you see what reward you got in the cycles since you scheduled a given command in a given state. Clearly, this is more complicated than FR-FCFS; it's going to cost you something, but it's also going to give you some benefit.
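The update rule sketched above (weight future rewards, move the table entry toward the observed long-term return) is classic one-step Q-learning. Here is a minimal tabular sketch; the class name, hyperparameters, and action set are illustrative choices, not the exact design from the self-optimizing memory controller paper.

```python
# Tabular Q-learning sketch: Q(s, a) estimates the long-term
# (discounted) reward for taking action a in state s.

from collections import defaultdict

class QScheduler:
    def __init__(self, actions, alpha=0.1, gamma=0.95):
        self.q = defaultdict(float)   # (state, action) -> estimated return
        self.actions = actions
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount: weight of future rewards

    def best_action(self, state):
        # Schedule the action with the highest learned long-term value.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Move Q(s, a) toward r + gamma * max_a' Q(s', a'),
        # i.e., immediate reward plus discounted future value.
        target = reward + self.gamma * max(
            self.q[(next_state, a)] for a in self.actions)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

After enough feedback, the action that keeps earning reward (say, a command that utilizes the data bus) ends up with the highest table entry and gets picked.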
The paper describes all of this, and you can read it. States come from the state of the system; actions are the actions taken by the scheduler. Actions are the easiest part to define, because you know what actions are available to you as a memory scheduler. Rewards are very important: you need to be very careful about the reward function, because if you don't specify it correctly, you may be optimizing for something you don't want. States are also very important: you need to figure out which state attributes to observe, because if you looked at all the state attributes in a system, there could be millions; that explodes your table size, and it reduces your learning speed as well.

Let's take a look at what we did. We wanted to maximize data-bus utilization, so our reward function was very simple: a +1 reward for scheduling read and write commands, and zero reward for all other commands. This makes sense, right? Of course, there's also the update function for the reward, which I showed you with the gammas and r's; that update function is also important.

For the state attributes, the selection can be automated; there's a lot of work in machine learning today on automating the discovery of which state attributes to look at. We didn't do that; we did it statically. We had a list of more than 400 different state attributes you can observe in a system, and with a lot of simulation of the reinforcement-learning scheduler, we reduced the set down to about six in the end, which turned out to be the most important ones. This is where the human designer still needs to do a lot of work: they need to specify the reward function, and they need to figure out which state attributes are good for the machine-learning agent to observe; you start with some intuition and try to narrow the state space. Number of reads and number of writes turned out to be important, as did the number of load misses in the transaction queue (the memory scheduling buffers). That kind of makes sense: how many reads and how many writes you have has some impact on your scheduling decision. Number of pending writes: also important. Number of reorder-buffer heads waiting for the referenced row also turned out to be important. What does this mean? The reorder-buffer heads are the oldest instructions in each core; if one of them is waiting for the referenced row, that row is really important at this point, because an instruction waiting for it is blocking the progress of the processor. Relative reorder-buffer order also has some impact, on the criticality of the request: in a 128-entry instruction window, or reorder buffer, is the request's instruction at the top or at the bottom? At the top means it's the oldest; at the bottom means it's the youngest, and if it's at the bottom, maybe it's not as important. Of course, we don't know this for certain, because the machine learning is just learning state-action pairs and reward values based on these states, but we can guess why our feature-selection mechanism selected them. So these are the six attributes that our 400-plus candidate attributes were narrowed down to after feature selection.
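The reward function and state encoding just described can be sketched as follows. The state tuple uses a few of the attributes mentioned in the lecture (number of reads, writes, and load misses in the queue); the exact encoding and all function names here are illustrative, not the paper's.

```python
# Sketch of the per-cycle pieces of the learned scheduler: the reward
# function (+1 only when the data bus does useful work), a coarse
# state encoding, and value-based command selection.

def reward(command):
    """+1 for READ/WRITE (data bus utilized this cycle), 0 otherwise."""
    return 1.0 if command in ("READ", "WRITE") else 0.0

def make_state(queue):
    """Encode scheduler-visible state from the request queue.
    `queue` is a list of ("READ"|"WRITE", is_load_miss) tuples."""
    n_reads = sum(1 for kind, _ in queue if kind == "READ")
    n_writes = sum(1 for kind, _ in queue if kind == "WRITE")
    n_load_misses = sum(1 for _, lm in queue if lm)
    return (n_reads, n_writes, n_load_misses)

def pick_command(q_values, state, legal_commands):
    """Among commands legal under the timing constraints this cycle,
    pick the one with the highest learned long-term value."""
    return max(legal_commands, key=lambda c: q_values.get((state, c), 0.0))
```

Note that `pick_command` only chooses among commands that are legal this cycle; the timing constraints still have to be enforced outside the learning machinery.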
attributes.were narrowed down to okay actions as I.said actions are relatively simple.clearly you have some actions to do.we distinguish between loads and stores.load and store misses over here this is.you still have some freedom over here.right because you may want to.distinguish between a read that's.because of a load miss and read that's.because of a loudness store miss and.that's we said that these are actually.different actions there are really not.different actions from the memory.controllers perspective but there really.there are different actions from a.learning perspective potentially because.you may want to actually learn the.different rewards for this write it.turned out that this was actually.important there's also no as you can see.over here and pre-charge we we actually.did two different kinds of peach are.just pre charge pending when there's.really a pre charge that's needed and a.pre-emptive recharge meaning you don't.there's no command that requires a pre.charge but you may want to pre charge.the role right.but you need to actually incorporate.this action because if you don't have.that too you will never preemptively pre.charge right the the learning agent will.never learn to issue a pre charge okay.any other questions ok let's take a look.at some results I guess you're waiting.for the results so basically the results.are good let me put it that way you can.read the paper but it this leads to.large robust performance improvements.over many Human Design policies and we.actually try to use exactly the same.information and automate the exploration.of human possible InDesign policy space.and we found out that we cannot beat.this scheduler I think for example if.you look over here with these workloads.it gets about 19% performs improvement.and the maximum that you can get the.optimistic scheduler is really 70.percent that's the ideal scheduler and I.believe that these results can actually.change depending on the intensity of the.workloads I don't 
believe that these.were the most intensive workloads we.examined but if you actually this is at.five CFS it's the baseline so you get.19% if you try to optimize that far FCFS.very heavily and go through a state.space exploration do the reordering as.much as possible between using the same.attributes that we found out over here.it turns out you get about 5% more on.top of frf CFS so there's benefits from.online learning that's coming me also.look at online versus offline versions.of this.clear this in online policy right it.changes to the scheduling policy online.but you could also imagine designing an.offline machine learning based memory.controller you found out an offline.scheduling policy and then you bake it.into hardware you never learn over time.it turns out that's better than existing.systems there's not as good as doing.online learning so clearly there's some.benefit that comes from online learning.you adapt to the conditions of the.system and the workload and there's a.lot more details in the paper yes yeah.it's not yeah you could think of it as a.theoretical maximum in the sense that.these we got rid of all the scheduling.constraints except the data bus.conflicts there's no timing constraint.other than data bus timing constraints.so it's an unachievable maximum you.could actually bound it lower but we.want to look at that but I don't think.any scheduler can achieve that actually.any other questions okay.you'll love fun reading this paper it is.interesting okay cool I think there.needs to be a lot more as we will.discuss a little bit so let's talk about.the advanced and disadvantages of this.so I in my opinion there are two major.advantages basically this provides you.continuous learning in the presence of.changing environment let's get rid of.this stupid scheduler let's say right.the scheduler is basically doing the.same thing for five years it hasn't.learned a thing from what it says it has.seen it's seen a lot it's seen a lot of.instructions a lot of 
memory requests.and has learned nothing right so but.this one is very different right it.learns of course it comes at a hardware.cost the paper analyze the hardware cost.you need a 32 kilobyte buffer inside.there I believe it's implementable so.it's not that bad.actually people have worked on.reinforcement learning in Hardware in.the as early as 1970s we borrowed some.of their implementation.it's called C Mac I don't remember what.C MAX stood for but it's cellular.something err a computer anyway you can.look at it in the paper so basically the.hardware implementation is also not that.although that's going to be a negative.over here.so it also reduces the designer burden.in finding a good scheduling policy.because now the designer has a.higher-level function if you will the.designer doesn't dictate.the policy is the designer figures out.what system maryville might be useful.and it inputs that to a feature.selection process and the designer pride.what target optimized but he or she.doesn't say how to optimize it basically.how to optimize this completely up to.the machine learning agent it's.automatic.I think that's reducing the designer.burden significantly in general and if.you look at these controllers that's.where the burden really is of course.they're down site now right how do you.specify different objectives I think our.objective was actually very simple.database utilization but that's not the.only objective that you have in real.systems clear there's fairness there's.quality of service.there's predictable performance.requirements that you have sometimes.those all need to be incorporated into.the memory scheduler and we're going to.talk about those but not for a machine.learning perspective later on Hardware.complexity I think was an issue.clear this is more complex but you get.need to pay something to get something.and nothing is free as you know already.I think this can be managed and I think.there needs to be more research in terms.of how to actually make 
this less.complex in general and maybe there needs.to be other algorithms to explore.writing importantly learning is great I.think but there may be other algorithms.that can that you can take advantage of.today and I think maybe the one of the.hardest part is design mindset and flow.so clear this goes against the design.flow that you have in existing systems.right you have a design flaw that they.success policy is specified all of your.testing today assumes that your policy.is specified right how you test.something like this because you don't.know what the results that you're.expecting right whenever whenever you do.test the hardware test today you give.some inputs you know exactly what the.output should be here you don't know no.idea actually.in fact we incorporate randomness into.the scheduling policy because remember.remember the most that I talked about.Mouse actually Skinner's Mouse randomly.figures out to press the lever so.initially you need to do some random.exploration in machine learning there's.always a trade-off between exploration.and exploitation right.you have some policy you learned it we.exploit it but if you keep exploiting.that policy if you don't learn new.things.so you really need to have provisions.for exploration as well and that's why.randomness comes in the way we.incorporated exploration into this.policy is basically once in a while with.very small probability we don't follow.what we just described we don't.basically pick the action that gives you.the highest reward we pick a random.action and that random action enables.you to explore different different.spaces in your in your state action pair.and reward functions basically in that.mapping and that's really critical.basically and once you incorporate that.sort of randomness even if you don't.incorporate that sort of randomness you.don't know what the output will be so.it's very this design mindset also needs.to change because you really design the.controller in a very different way 
and.perhaps this is the hardest part as we.will we've also talked about in mindset.right earlier ok so I'm going to assign.this for you hopefully you'll have fun.reading it has anybody do machine.learning here before ok let's have you.have you studied reinforcement learning.no ok so if you're actually interested.in studying reinforcement learning.Richard Sutton who is really the person.who's developed a lot of the initial.reinforcement learning algorithm has a.book on it I think it's called.reinforcement learning I don't know I.don't think it has a longer name but.maybe it does and it's also free online.if you go to his web page you get a PDF.and he's updated recently I studied that.book a long time ago which was not.updated in 2018 clearly but it's now.it's now up to date you can you can look.at a lot of reinforcement learning over.there and nicely so he actually uses.this as as one of the successful.applications of reinforcement learning.in recent years it's not that recent as.you can see it's that's already 12 years.or so since we did this work I think.there needs to be a lot more work in.this area going forward so let me pull.back a little I think this is really.thinking about our architecture as a.self optimizing matter and memory.controller even though it's a very good.place to think about it this way.I think there are a lot of other.controllers that we need to think about.and I also call this data-driven right.clearly what we've done is it's self.optimizing over time it's also.data-driven it's.looking at the data it's is trying to.make sense out of the data and form a.policy out of that data that it sees in.this case data are all of the addresses.all of the commands that it schedules.and learning from that data and I think.we need to do more of this so if you.look at system architecture design today.we have mostly human driven design.humans do everything basically including.the testing actually most of the testing.but we're not talking about testing 
here.humans design the policies they.basically dictate how to do things in.the end the machine is going to execute.them but really humans are designing.them maybe that's not the right way.going forward as a result of this.because humans are not capable of.dealing with very complicated state.spaces you get very simple short-sighted.policies all over the system you cannot.blame the humans for this I think I.think we're all very capable for many.things but maybe not necessarily dealing.with huge amounts of states and you can.see that in memory controllers I first.first ready first-come first-serve is.clearly very simple deciding which row.buffer the whether to keep the row.before open or closed.anyone with prediction mechanism is.still very simple because you design the.prediction mechanism and the prediction.mechanism is fixed but if it was machine.learning base it would be different.there's no automatic data driven policy.learning as a result of this actually.the hardware design it has no machine.learning in it as far as I know there's.some exceptions to it which you will.talk about some branch predictors in.some systems employ perceptrons which.are a very limited form of machine.learning it's a single layer network.basically that's it and some some branch.predictors actually do that and that's a.very good step in the right direction.but that's very simple also but there's.no other controller has automatic data.driven policy learning in fact there's.almost no learning you cannot take.lessons from past actions as we.discussed so the key questions can be.really designed fundamentally.intelligent architectures I think.fundamentally intelligent architectures.mean that you need to design.architectures that are more self.optimizing so what is that I think it.needs to be data-driven as opposed to.human driven machine learns the best.policies and how to do things it doesn't.be it.get dictated by the humans and this.hopefully leads to sophisticated work.both driven changing 
farsighted policies.basically over the lifetime of five.years you'll learn a lot and you adapt.and you become better and better and.better and better just like humans do.right hopefully we learn from our past.actions and we become better now I don't.know if that that's evidence clearly.politicians don't learn from past.actions so they make life doors and.worse in the world but that may not be.true for the average humans.I think average humans learn quite a bit.okay and this leads to automatic.data-driven policy learning right and.hopefully in the end the all controls.are intelligent data-driven agents right.so they can they have some autonomy if.you will so I think given this we you.really need to rethink the design of all.controllers especially if you really.want to design more intelligent.architectures and actually I think that.really prints a different principle I.gave a talk recently and I promise you.that we're going to talk about to all of.these three you already talked about.this data centric principle which is.really in memory computation for example.that was an example actually low latency.architectures are really data centric if.you really want to be data centric you.really want to treat the data in the.best way and you really want to reduce.the latency and energy of that data.that's that's the data centric now I'm.giving you the data-driven principle.right this is really you designed the.system such that it's data-driven as.opposed to human driven and now you.understand what this means hopefully.we're going to talk about that later.also but basically this this says that.we need to understand what data we're.dealing with and me to adapt the.policies to the different types of data.that they're dealing with what we're.going to talk about that I think all of.these really resemble something like.this more in my opinion but I think you.really need to think it costs to stack I.think what I've shown you here is really.machine learning and 
architecture.interaction if you will I don't know.where to place machine learning over.here but you really can cut across some.of the parts of the stack over here.especially if you want to incorporate.quality of service here okay so this is.where I want to stop any questions so.we're going to pick up.some number of minutes.

How to generate an electronic signature for the Tr 13a online

CocoSign is a browser-based application that can be used on any device with an internet connection. CocoSign gives its customers a straightforward way to e-sign their Tr 13a.

It offers an all-in-one package combining validity, convenience, and efficiency. Follow these instructions to add a signature to a form online:

  1. Confirm you have a good internet connection.
  2. Open the document which needs to be electronically signed.
  3. Select the "My Signature" option and click it.
  4. After clicking 'My Signature', you will be given options; choose your uploaded signature.
  5. Design your e-signature and click 'Ok'.
  6. Press "Done".

You have now successfully signed your PDF online. You can access your form and email it. Besides e-signing, CocoSign offers features such as adding fields, inviting others to sign, and combining documents.

How to create an electronic signature for the Tr 13a in Chrome

Google Chrome is one of the most widely used browsers in the world, thanks to its wide range of tools and extensions. To meet its users' needs, CocoSign is available as a Chrome extension, which can be downloaded from the Google Chrome Web Store.

Follow these easy instructions to design an e-signature for your form in Google Chrome:

  1. Navigate to the Chrome Web Store and search for CocoSign.
  2. In the search results, click the 'Add' option.
  3. Now, sign in to your registered Google account.
  4. Open the link to the document and click the 'Open in e-sign' option.
  5. Click the 'My Signature' option.
  6. Create your signature and place it in the document wherever you choose.

After adding your e-signature, email your document or share it with your team members. CocoSign also lets its users merge PDFs and add more than one signee.

How to create an electronic signature for the Tr 13a in Gmail?

These days, businesses have changed the way they work and gone paperless, which often means signing contracts over email. You can easily e-sign the Tr 13a without logging out of your Gmail account.

Follow the instructions below:

  1. Install the CocoSign extension from the Google Chrome Web Store.
  2. Open the document that needs to be e-signed.
  3. Click the "Sign" option and create your signature.
  4. Click 'Done', and the signed document will be attached to a draft email created by CocoSign's e-signature application.

The CocoSign extension makes signing much easier. Try it today!

How to create an e-signature for the Tr 13a straight from your smartphone?

Smartphones have largely replaced PCs and laptops over the past 10 years. To make your life easier, CocoSign helps you keep your workflow flexible on your personal mobile device.

A good internet connection is all you need on your mobile to e-sign your Tr 13a with a tap of your finger. Follow the instructions below:

  1. Navigate to the website of CocoSign and create an account.
  2. Then, click and upload the document that you need to get e-signed.
  3. Press the "My signature" option.
  4. Draw and apply your signature to the document.
  5. View the document and tap 'Done'.

It takes only an instant to add an e-signature to the Tr 13a from your mobile. Download or share your form as you wish.

How to create an e-signature for the Tr 13a on iOS?

iOS users will be glad to know that CocoSign offers an iOS app for their convenience. If you need to e-sign the Tr 13a on iOS, the CocoSign application makes it straightforward.

Here's how to add an electronic signature to the Tr 13a on iOS:

  1. Install the application from the App Store.
  2. Register for an account with your email address, or via your Facebook or Google account.
  3. Upload the document that needs to be signed.
  4. Select the section where you want to sign and tap 'Insert Signature'.
  5. Create your signature as you prefer and place it in the document.
  6. You can email the document or upload it to the cloud.

How to create an electronic signature for the Tr 13a on Android?

The huge popularity of Android phones has given rise to the development of CocoSign for Android. You can install the application on your Android phone from the Google Play Store.

You can add an e-signature to the Tr 13a on Android by following these instructions:

  1. Log in to your CocoSign account with your email address, Facebook, or Google account.
  2. Open the PDF file that needs to be signed electronically by clicking the "+" icon.
  3. Navigate to the section where you need to place your signature and create it in the pop-up window.
  4. Finalize and adjust it by clicking the '✓' symbol.
  5. Save the changes.
  6. Download and share your document as desired.

Get CocoSign today to streamline your business operations, and save yourself a lot of time and energy by signing your Tr 13a online.

Tr 13a FAQs

Here you can find answers to the most popular questions about the Tr 13a. If you have specific questions, click 'Contact Us' at the top of the site.

Need help? Contact support

How can I fill out Google's intern host matching form to optimize my chances of receiving a match?

I was selected for a summer internship in 2016. I tried to be very open while filling out the preference form: I chose many products as my favorites, and I said I was open about the team I wanted to join. I was even very open about the location and start date to get host-matching interviews (I negotiated the start date in the interview until both my host and I were happy). You could ask your recruiter to review your form (they are very cool and could help you a lot, since they have more experience). Do a search on the potential team. Before the interviews, try to find smart questions that you can ask.

How do I fill out the form of DU CIC? I couldn't find the link to fill out the form.

Just register on the admission portal, and during registration you will get an option for the entrance-based course. Just register there. There is no separate form for DU CIC.

How can you fill out the W-8BEN form (no tax treaty)?

If there is no tax treaty between your country of residency and the USA, complete only Part I of Form W-8BEN. Part II applies to residents of a tax treaty country. Affirm the Certification in Part III by signing your name and placing a date as required. And you’re done.

Can I register a car without a title?

Here are some ways: 1. The details you will need to get your auto titled vary from state to state. 2. If you have a car without a title, contact the previous owner to determine whether they still have it. 3. The easiest and quickest way to get a vehicle title is by finding the previous owner and going over the necessary paperwork together.

How do you know if you need to fill out a 1099 form?

It can also be that he used the wrong form and will still be deducting taxes as he should be. Using the wrong form while doing the right thing isn't exactly a federal offense.

Can you drive a car without a title?

You should be able to renew the registration and get the new title at the same time. If the title is in another person’s name, they will have to go with you.

How do I get a salvage title cleared in Michigan?

It depends where you live. In the UK you can't get it removed, and some categories mean the car can never be put back on the road no matter how much it is rebuilt.

Easier, Quicker, Safer eSignature Solution for SMBs and Professionals

No credit card required. 14 days free.